Write Yourself a Scheme in 48 Hours/Error Checking and Exceptions

Write Yourself a Scheme in 48 Hours
← Evaluation, Part 1	Error Checking and Exceptions	Evaluation, Part 2 →

Currently, there are a variety of places within the code where we either ignore errors or silently assign "default" values like #f or 0 that make no sense. Some languages – like Perl and PHP – get along fine with this approach. However, it often means that errors pass silently throughout the program until they become big problems, which means rather inconvenient debugging sessions for the programmer. We'd like to signal errors as soon as they happen and immediately break out of execution.

First, we need to import Control.Monad.Except to get access to Haskell's built-in error functions:

import Control.Monad.Except

On Debian based systems this needs libghc6-mtl-dev installed.

Then, we should define a data type to represent an error:

data LispError = NumArgs Integer [LispVal]
               | TypeMismatch String LispVal
               | Parser ParseError
               | BadSpecialForm String LispVal
               | NotFunction String String
               | UnboundVar String String
               | Default String

This is a few more constructors than we need at the moment, but we might as well foresee all the other things that can go wrong in the interpreter later. Next, we define how to print out the various types of errors and make LispError an instance of Show:

showError :: LispError -> String
showError (UnboundVar message varname)  = message ++ ": " ++ varname
showError (BadSpecialForm message form) = message ++ ": " ++ show form
showError (NotFunction message func)    = message ++ ": " ++ show func
showError (NumArgs expected found)      = "Expected " ++ show expected 
                                       ++ " args; found values " ++ unwordsList found
showError (TypeMismatch expected found) = "Invalid type: expected " ++ expected
                                       ++ ", found " ++ show found
showError (Parser parseErr)             = "Parse error at " ++ show parseErr

instance Show LispError where show = showError

Then we define a type to represent functions that may throw a LispError or return a value. Remember how parse used an Either data type to represent exceptions? We take the same approach here:

type ThrowsError = Either LispError

Type constructors are curried just like functions, and can also be partially applied. A full type would be Either LispError Integer or Either LispError LispVal, but we want to say ThrowsError LispVal and so on. We only partially apply Either to LispError, creating a type constructor ThrowsError that we can use on any data type.

Either is yet another instance of a monad. In this case, the "extra information" being passed between Either actions is whether or not an error occurred. Bind applies its function if the Either action holds a normal value, or passes an error straight through without computation. This is how exceptions work in other languages, but because Haskell is lazily-evaluated, there's no need for a separate control-flow construct. If bind determines that a value is already an error, the function is never called.

The Control.Monad.Except library automatically gives the Either monad two other functions besides the standard monadic ones:

throwError, which takes an Error value and lifts it into the Left (error) constructor of an Either
catchError, which takes an Either action and a function that turns an error into another Either action. If the action represents an error, it applies the function, which you can use to, e.g. turn the error value into a normal one via return or re-throw as a different error.

In our program, we'll be converting all of our errors to their string representations and returning that as a normal value. Let's create a helper function to do that for us:

trapError action = catchError action (return . show)

The result of calling trapError is another Either action which will always have valid (Right) data. We still need to extract that data from the Either monad so it can be passed around to other functions:

extractValue :: ThrowsError a -> a
extractValue (Right val) = val

We purposely leave extractValue undefined for a Left constructor, because that represents a programmer error. We intend to use extractValue only after a catchError, so it's better to fail fast than to inject bad values into the rest of the program.

Now that we have all the basic infrastructure, it's time to start using our error-handling functions. Remember how our parser had previously just returned a string saying "No match" on an error? Let's change it so that it wraps and throws the original ParseError:

readExpr :: String -> ThrowsError LispVal
readExpr input = case parse parseExpr "lisp" input of
     Left err -> throwError $ Parser err
     Right val -> return val

Here, we first wrap the original ParseError with the LispError constructor Parser, and then use the built-in function throwError to return that in our ThrowsError monad. Since readExpr now returns a monadic value, we also need to wrap the other case in a return function.

Next, we change the type signature of eval to return a monadic value, adjust the return values accordingly, and add a clause to throw an error if we encounter a pattern that we don't recognize:

eval :: LispVal -> ThrowsError LispVal
eval val@(String _) = return val
eval val@(Number _) = return val
eval val@(Bool _) = return val
eval (List [Atom "quote", val]) = return val
eval (List (Atom func : args)) = mapM eval args >>= apply func
eval badForm = throwError $ BadSpecialForm "Unrecognized special form" badForm

Since the function application clause calls eval (which now returns a monadic value) recursively, we need to change that clause. First, we had to change map to mapM, which maps a monadic function over a list of values, sequences the resulting actions together with bind, and then returns a list of the inner results. Inside the Error monad, this sequencing performs all computations sequentially but throws an error value if any one of them fails – giving you Right [results] on success, or Left error on failure. Then, we used the monadic "bind" operation to pass the result into the partially applied "apply func", again returning an error if either operation failed.

Next, we change apply itself so that it throws an error if it doesn't recognize the function:

apply :: String -> [LispVal] -> ThrowsError LispVal
apply func args = maybe (throwError $ NotFunction "Unrecognized primitive function args" func)
                        ($ args)
                        (lookup func primitives)

We didn't add a return statement to the function application ($ args). We're about to change the type of our primitives, so that the function returned from the lookup itself returns a ThrowsError action:

primitives :: [(String, [LispVal] -> ThrowsError LispVal)]

And, of course, we need to change the numericBinop function that implements these primitives so it throws an error if there's only one argument:

numericBinop :: (Integer -> Integer -> Integer) -> [LispVal] -> ThrowsError LispVal
numericBinop op           []  = throwError $ NumArgs 2 []
numericBinop op singleVal@[_] = throwError $ NumArgs 2 singleVal
numericBinop op params        = mapM unpackNum params >>= return . Number . foldl1 op

We use an at-pattern to capture the single-value case because we want to include the actual value passed in for error-reporting purposes. Here, we're looking for a list of exactly one element, and we don't care what that element is. We also need to use mapM to sequence the results of unpackNum, because each individual call to unpackNum may fail with a TypeMismatch:

unpackNum :: LispVal -> ThrowsError Integer
unpackNum (Number n) = return n
unpackNum (String n) = let parsed = reads n in 
                           if null parsed 
                             then throwError $ TypeMismatch "number" $ String n
                             else return $ fst $ parsed !! 0
unpackNum (List [n]) = unpackNum n
unpackNum notNum     = throwError $ TypeMismatch "number" notNum

Finally, we need to change our main function to use this whole big error monad. This can get a little complicated, because now we're dealing with two monads (Either (for errors) and IO). As a result, we go back to do-notation, because it's nearly impossible to use point-free style when the result of one monad is nested inside another:

main :: IO ()
main = do
     args <- getArgs
     evaled <- return $ liftM show $ readExpr (args !! 0) >>= eval
     putStrLn $ extractValue $ trapError evaled

Here's what this new function is doing:

args is the list of command-line arguments.
evaled is the result of:
1. taking first argument (args !! 0);
2. parsing it (readExpr);
3. passing it to eval (>>= eval; the bind operation has higher precedence than $);
4. calling show on it within the Error monad. (Note also that the whole action has type IO (Either LispError String), giving evaled type Either LispError String. It has to be, because our trapError function can only convert errors to Strings, and that type must match the type of normal values.)
Caught is the result of:
1. calling trapError on evaled, converting errors to their string representation;
2. calling extractValue to get a String out of this Either LispError String action;
3. printing the results through putStrLn.

Compile and run the new code, and try throwing it a couple errors:

$ ghc -package parsec -o errorcheck [../code/listing5.hs listing5.hs]
$ ./errorcheck "(+ 2 \"two\")"
Invalid type: expected number, found "two"
$ ./errorcheck "(+ 2)"
Expected 2 args; found values 2
$ ./errorcheck "(what? 2)"
Unrecognized primitive function args: "what?"

Some readers have reported that you need to add a --make flag to build this example, and presumably all further listings. This tells GHC to build a complete executable, searching out all dependencies listed in the import statements. The command above works on my system, but if it fails on yours, give --make a try.

Write Yourself a Scheme in 48 Hours
← Evaluation, Part 1	Error Checking and Exceptions	Evaluation, Part 2 →