Haskell/do Notation

From Wikibooks, open books for an open world

Among the initial examples of monads, there were some which used an alternative syntax with do blocks for chaining computations. Those examples, however, were not the first time we saw do: back in Simple input and output, code for doing input-output was written in exactly the same way. That is no coincidence: what we have been calling IO actions are just computations in a monad, namely the IO monad. We will revisit IO soon; for now, though, let us consider exactly how the do notation translates into regular monadic code. Since the following examples all involve IO, we will refer to the computations/monadic values as actions, as in the earlier parts of the book. Still, do works with any monad; there is nothing specific to IO in how it works.

Translating the then operator

The (>>) (then) operator is easy to translate between do notation and plain code, so we will see it first. For example, suppose we have a chain of actions like the following one:

putStr "Hello" >> 
putStr " " >> 
putStr "world!" >> 
putStr "\n"

We can rewrite it in do notation as follows:

do putStr "Hello"
   putStr " "
   putStr "world!"
   putStr "\n"

This sequence of instructions is very similar to what you would see in any imperative language such as C. The actions being chained could be anything, as long as all of them are in the same monad. In the context of the IO monad, for instance, an action might be writing to a file, opening a network connection or asking the user for input. The general way we translate these actions from the do notation to standard Haskell code is:

do action1
   action2
   action3

which becomes

action1 >>
do action2
   action3

and so on, until the do block is down to a single action, which translates to itself.
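Since do works with any monad, the translation can be checked without IO. The following is a minimal sketch in the Maybe monad; the names chainDo, chainThen and brokenDo are made up for this illustration.

```haskell
-- Three Maybe computations sequenced with do notation.
chainDo :: Maybe String
chainDo = do
  Just "first"
  Just "second"
  Just "third"

-- The same chain after applying the translation rule twice.
chainThen :: Maybe String
chainThen = Just "first" >> (Just "second" >> Just "third")

-- In Maybe, a Nothing anywhere in the chain makes the whole chain Nothing.
brokenDo :: Maybe String
brokenDo = do
  Just "first"
  Nothing
  Just "third"
```

Both chainDo and chainThen evaluate to Just "third": (>>) discards the results of all but the last computation, while still letting each one affect the outcome, as brokenDo shows.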

Translating the bind operator

The bind operator (>>=) is a bit more difficult to translate from and to do notation, essentially because it involves passing a value, namely the result of an action, downstream in the binding sequence. In a do block, such a value is bound to a name with the <- notation and can then be used by the statements that follow.

do x1 <- action1
   x2 <- action2
   action3 x1 x2

x1 and x2 are the results of action1 and action2 (for instance, if action1 is an IO Integer then x1 will be bound to an Integer). They are passed as arguments to action3, which returns a third action. The do block is broadly equivalent to the following vanilla Haskell snippet:

action1 >>= \x1 -> action2 >>= \x2 -> action3 x1 x2

The second argument of (>>=) is a function specifying what to do with the result of the action passed as first argument; and so by chaining lambdas in this way we can pass results downstream. Remember that without extra parentheses a lambda extends all the way to the end of the expression, so x1 is still in scope at the point we call action3. We can rewrite the chain of lambdas more legibly as:

action1 >>= \x1 ->
action2 >>= \x2 ->
action3 x1 x2
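To see the two forms side by side on something we can evaluate, here is a small sketch in the Maybe monad. The safeDiv helper and the calcDo and calcBind names are hypothetical, introduced only for this example.

```haskell
-- A division that fails with Nothing instead of crashing on a zero divisor.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- do version: x1 and x2 are bound with <- and used further down.
calcDo :: Int -> Int -> Int -> Maybe Int
calcDo a b c = do
  x1 <- safeDiv a b
  x2 <- safeDiv x1 c
  return (x1 + x2)

-- The same computation with explicit binds and lambdas.
-- x1 is still in scope in the last line because each lambda
-- extends to the end of the expression.
calcBind :: Int -> Int -> Int -> Maybe Int
calcBind a b c =
  safeDiv a b >>= \x1 ->
  safeDiv x1 c >>= \x2 ->
  return (x1 + x2)
```

For example, calcDo 100 5 2 and calcBind 100 5 2 both give Just 30, and both give Nothing as soon as one of the divisors is zero.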

The fail method

Above we said the snippet with lambdas was "broadly equivalent" to the do block. It is not an exact translation because the do notation adds special handling of pattern match failures. x1 and x2 when placed at the left of either <- or -> are patterns being matched. Therefore, if action1 returned a Maybe Integer we could write a do block like this...

do Just x1 <- action1
   x2      <- action2
   action3 x1 x2

...and x1 will be bound to an Integer. In such a case, however, what happens if action1 returns Nothing? Ordinarily, the program would crash with a non-exhaustive patterns error, just like the one we get when calling head on an empty list. With do notation, however, such failures are handled by the fail method of the relevant monad. The translation of the first statement done behind the scenes is equivalent to:

action1 >>= f
where f (Just x1) = do x2 <- action2
                       action3 x1 x2
      f _         = fail "..." -- A compiler-generated message.

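What fail actually does is up to the monad instance. While it will often just rethrow the pattern matching error, monads which incorporate some sort of error handling may deal with the failure in their own specific ways. For instance, Maybe has fail _ = Nothing; analogously, for the list monad fail _ = [] [1]. Note that since GHC 8.8 fail is a method of the separate MonadFail class rather than of Monad itself, so a do block containing a pattern that can fail only type checks if the monad has a MonadFail instance.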

All things considered, the fail method is an artefact of do notation. It is better not to call it directly from your code, and to only rely on automatic handling of pattern match failures when you are sure that fail will do something sensible for the monad you are using.
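The behaviour of fail in Maybe and in the list monad can be observed directly. The sketch below uses two made-up functions, justs and headPlusOne, whose do blocks contain patterns that may not match.

```haskell
-- List monad: an element that does not match the pattern triggers
-- fail, which is [], so it simply contributes no results.
justs :: [Maybe Int] -> [Int]
justs xs = do
  Just x <- xs
  return x

-- Maybe monad: if the list inside is empty, the pattern (x:_) does
-- not match, fail is called, and the whole block becomes Nothing.
headPlusOne :: Maybe [Int] -> Maybe Int
headPlusOne m = do
  (x:_) <- m
  return (x + 1)
```

Here justs [Just 1, Nothing, Just 3] gives [1,3], while headPlusOne (Just []) gives Nothing instead of crashing. Both are cases in which relying on the automatic handling is reasonable, as fail does something sensible in these monads.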

Example: user-interactive program


We are going to interact with the user, so we will use putStr and getLine alternately. To avoid unexpected results in the output, remember to disable output buffering by importing System.IO and putting hSetBuffering stdout NoBuffering at the start of your program. Alternatively, you can explicitly flush the output buffer before each interaction with the user (namely, before each getLine) using hFlush stdout. If you are testing this code in GHCi, you will not run into such problems.
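The two options just described might look as follows. This is only a sketch: disableOutputBuffering and prompt are names invented here, while hSetBuffering, NoBuffering, hFlush and stdout all come from System.IO.

```haskell
import System.IO (BufferMode (NoBuffering), hFlush, hSetBuffering, stdout)

-- Option 1: switch off buffering on stdout once, before any prompt
-- is printed (typically as the first statement of main).
disableOutputBuffering :: IO ()
disableOutputBuffering = hSetBuffering stdout NoBuffering

-- Option 2: leave buffering as it is and flush stdout explicitly
-- before each read, so the question is visible before we wait.
prompt :: String -> IO String
prompt question = do
  putStr question
  hFlush stdout
  getLine
```

With the second option, a statement such as first <- prompt "What is your first name? " would take the place of a putStr followed by a getLine.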

Consider this simple program that asks the user for his or her first and last names:

nameDo :: IO ()
nameDo = do putStr "What is your first name? "
            first <- getLine
            putStr "And your last name? "
            last <- getLine
            let full = first ++ " " ++ last
            putStrLn ("Pleased to meet you, " ++ full ++ "!")

The code in do notation is readable and easy to follow. The <- notation makes it possible to treat the first and last names as if they were ordinary pure values, even though they come from getLine, which is not pure: it can give a different result every time it is run.

A possible translation into vanilla monadic code would be:

nameLambda :: IO ()
nameLambda = putStr "What is your first name? " >>
             getLine >>= \first ->
             putStr "And your last name? " >>
             getLine >>= \last ->
             let full = first ++ " " ++ last
             in putStrLn ("Pleased to meet you, " ++ full ++ "!")

In cases like this, in which we just want to chain several actions, the imperative style of do notation feels natural, and can be pretty convenient. In comparison, monadic code with explicit binds and lambdas is something of an acquired taste.

The example includes a let statement in the do block. Such a statement is translated to a regular let expression, with the in part being the translation of whatever follows it in the do block (in the example, that means the final putStrLn).
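The treatment of let can be checked in a monad whose results we can compare. Below is a sketch in the Maybe monad; fullNameDo and fullNameLet are hypothetical names, and the greeting text is borrowed from the example above.

```haskell
-- do version, with a let statement in the middle of the block.
fullNameDo :: Maybe String -> Maybe String -> Maybe String
fullNameDo mFirst mLast = do
  first <- mFirst
  surname <- mLast
  let full = first ++ " " ++ surname
  return ("Pleased to meet you, " ++ full ++ "!")

-- Translated version: the let statement becomes a let expression,
-- and everything that followed it in the do block goes after "in".
fullNameLet :: Maybe String -> Maybe String -> Maybe String
fullNameLet mFirst mLast =
  mFirst >>= \first ->
  mLast >>= \surname ->
  let full = first ++ " " ++ surname
  in return ("Pleased to meet you, " ++ full ++ "!")
```

Both functions give Just "Pleased to meet you, Ada Lovelace!" when applied to Just "Ada" and Just "Lovelace", and Nothing if either argument is Nothing.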

Returning values

The last statement in a do block determines the result of the whole block. In the previous example, the result was of type IO (), that is, an action which produces only the empty value ().

Suppose that we want to rewrite the example, but returning an IO String with the acquired name. All we need to do is add a return:

nameReturn :: IO String
nameReturn = do putStr "What is your first name? "
                first <- getLine
                putStr "And your last name? "
                last <- getLine
                let full = first ++ " " ++ last
                putStrLn ("Pleased to meet you, " ++ full ++ "!")
                return full

This example will "return" the full name as a string inside the IO monad, which can then be utilized downstream elsewhere:

greetAndSeeYou :: IO ()
greetAndSeeYou = do name <- nameReturn
                    putStrLn ("See you, " ++ name ++ "!")

Here, the name will be obtained from user input and the greeting will be printed as side effects of nameReturn. Its return value will then be used to prepare the goodbye message.

This kind of code is why it is so easy to misunderstand the nature of return: not only does it share a name with C's keyword, it also seems to have the same function here. A small variation on the example, however, will dispel that impression:

nameReturnAndCarryOn = do putStr "What is your first name? "
                          first <- getLine
                          putStr "And your last name? "
                          last <- getLine
                          let full = first++" "++last
                          putStrLn ("Pleased to meet you, "++full++"!")
                          return full
                          putStrLn "I am not finished yet!"

The string in the extra line will be printed out, as return is not a final statement interrupting the flow like in C and other languages. Indeed, the type of nameReturnAndCarryOn is IO (), the type of the final putStrLn action, and when the action is run the value produced by return full is discarded without a trace.
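The same point can be made without any input or output. In this sketch, which uses the Maybe monad and invented names, a return in the middle of a do block does not end the block; whatever comes after it still decides the result.

```haskell
-- The first return does not exit the block; the last statement wins.
carryOn :: Maybe String
carryOn = do
  return "early"
  return "late"

-- What the block above translates to.
carryOnThen :: Maybe String
carryOnThen = return "early" >> return "late"

-- A later statement can even turn the whole block into Nothing.
returnThenNothing :: Maybe String
returnThenNothing = do
  return "early"
  Nothing
```

Here carryOn and carryOnThen are both Just "late", and returnThenNothing is Nothing: in each case the value wrapped by the first return is simply thrown away by (>>).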

It is just sugar

The do notation is just a syntactical convenience; it does not add anything essential. Keeping that in mind, we can raise a few points about style. First of all, do is never necessary for a single action; and so the Haskell "Hello world" is simply...

main = putStrLn "Hello world!"

...without any do in sight. Additionally, snippets like this one are always redundant:

fooRedundant = do x <- bar
                  return x

Thanks to the monad laws, we can (and should!) write it as:

foo = bar
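The law in question is the right identity law, m >>= return = m. As a quick sanity check rather than a proof, the sketch below writes the redundant block, its translation and the simplified form as functions over Maybe Int; the three names are made up for the illustration.

```haskell
-- The redundant do block, with bar passed in as an argument.
viaDo :: Maybe Int -> Maybe Int
viaDo bar = do
  x <- bar
  return x

-- Its translation: bar >>= \x -> return x, which is bar >>= return.
viaBind :: Maybe Int -> Maybe Int
viaBind bar = bar >>= return

-- What the right identity law lets us write instead.
direct :: Maybe Int -> Maybe Int
direct bar = bar
```

All three give the same result for any argument, for instance Just 3 for Just 3 and Nothing for Nothing.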

A subtler but crucial point is related to function composition. As we already know, the greetAndSeeYou action in the section just above could be rewritten as:

greetAndSeeYou :: IO ()
greetAndSeeYou = nameReturn >>= \name -> putStrLn ("See you, " ++ name ++ "!")

While you might find the lambda a little unsightly, suppose we had a printSeeYou function defined elsewhere:

printSeeYou :: String -> IO ()
printSeeYou name = putStrLn ("See you, " ++ name ++ "!")

Things suddenly look much nicer, and arguably even nicer than in the do version:

greetAndSeeYou :: IO ()
greetAndSeeYou = nameReturn >>= printSeeYou

Or, if we had a non-monadic seeYou function:

import Control.Monad (liftM) -- liftM is not in the Prelude; goes with the module's other imports

seeYou :: String -> String
seeYou name = "See you, " ++ name ++ "!"
-- Reminder: liftM f m == m >>= return . f == fmap f m
greetAndSeeYou :: IO ()
greetAndSeeYou = liftM seeYou nameReturn >>= putStrLn

Keep in mind this last example with liftM; we will soon return to the theme of using non-monadic functions in monadic code, and why it can be useful.
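The equalities in the reminder comment can be tried out in a monad whose values we can compare. The following sketch uses Maybe; seeYou is repeated from above so that the snippet stands on its own, and the three via... names are hypothetical.

```haskell
import Control.Monad (liftM)

seeYou :: String -> String
seeYou name = "See you, " ++ name ++ "!"

-- Three ways of applying a non-monadic function inside a monad.
viaLiftM :: Maybe String -> Maybe String
viaLiftM m = liftM seeYou m

viaBindReturn :: Maybe String -> Maybe String
viaBindReturn m = m >>= return . seeYou

viaFmap :: Maybe String -> Maybe String
viaFmap m = fmap seeYou m
```

Applied to Just "Ada", each of the three gives Just "See you, Ada!"; applied to Nothing, each gives Nothing.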


  1. That explains why, as we pointed out in the "Pattern matching" chapter, pattern matching failures in list comprehensions are silently ignored.