Haskell/Understanding monads/IO

Monads

Prologue: IO, an applicative functor
Understanding monads
Maybe
List
do notation
IO
State
Alternative and MonadPlus
Monad transformers

edit this chapter

Two defining features of Haskell are pure functions and lazy evaluation. All Haskell functions are pure, which means that, when given the same arguments, they return the same results. Lazy evaluation means that, by default, Haskell values are only evaluated when some part of the program requires them – perhaps never, if they are never used – and repeated evaluation of the same value is avoided wherever possible.

Pure functions and lazy evaluation bring forth a number of advantages. In particular, pure functions are reliable and predictable; they ease debugging and validation. Test cases can also be set up easily since we can be sure that nothing other than the arguments will influence a function's result. Being entirely contained within the program, the Haskell compiler can evaluate functions thoroughly in order to optimize the compiled code. However, input and output operations, which involve interaction with the world outside the confines of the program, can't be expressed through pure functions. Furthermore, in most cases I/O can't be done lazily. Since lazy computations are only performed when their values become necessary, unfettered lazy I/O would make the order of execution of the real world effects unpredictable.

There is no way to ignore this issue, as any useful program needs to do I/O, even if it is only to display a result. That being so, how do we manage actions like opening a network connection, writing a file, reading input from the outside world, or anything else that goes beyond calculating a value? The main insight is: actions are not functions. The IO type constructor provides a way to represent actions as Haskell values, so that we can manipulate them with pure functions. In the Prologue chapter, we anticipated some of the key features of this solution. Now that we also know that IO is a monad, we can wrap up the discussion we started there.

Combining functions and I/O actions

Let's combine functions with I/O to create a full program that will:

Ask the user to insert a string
Read their string
Use fmap to apply a function shout that capitalizes all the letters from the string
Write the resulting string

module Main where

import Data.Char (toUpper)
import Control.Monad

main = putStrLn "Write your string: " >> fmap shout getLine >>= putStrLn

shout = map toUpper

We have a full-blown program, but we didn't include any type definitions. Which parts are functions and which are IO actions or other values? We can load our program in GHCi and check the types:

main :: IO ()
putStrLn :: String -> IO ()
"Write your string: " :: [Char]
(>>) :: Monad m => m a -> m b -> m b
fmap :: Functor m => (a -> b) -> m a -> m b
shout :: [Char] -> [Char]
getLine :: IO String
(>>=) :: Monad m => m a -> (a -> m b) -> m b

Whew, that is a lot of information there. We've seen all of this before, but let's review.

main is IO (). That's not a function. Functions are of types a -> b. Our entire program is an IO action.

putStrLn is a function, but it results in an IO action. The "Write your string: " text is a String (remember, that's just a synonym for [Char]). It is used as an argument for putStrLn and is incorporated into the IO action that results. So, putStrLn is a function, but putStrLn x evaluates to an IO action. The () part of the IO type (called a unit type) indicates that nothing is available to be passed on to any later functions or actions.

That last part is key. We sometimes say informally that an IO action "returns" something; however, taking that too literally leads to confusion. It is clear what we mean when we talk about functions returning results, but IO actions are not functions. Let's skip down to getLine — an IO action that does provide a value. getLine is not a function that returns a String because getLine isn't a function. Rather, getLine is an IO action which, when evaluated, will materialize a String, which can then be passed to later functions through, for instance, fmap and (>>=).

When we use getLine to get a String, the value is monadic because it is wrapped in IO functor (which happens to be a monad). We cannot pass the value directly to a function that takes plain (non-monadic, or non-functorial) values. fmap does the work of taking a non-monadic function while passing in and returning monadic values.

As we've seen already, (>>=) does the work of passing a monadic value into a function that takes a non-monadic value and returns a monadic value. It may seem inefficient for fmap to take the non-monadic result of its given function and return a monadic value only for (>>=) to then pass the underlying non-monadic value to the next function. It is precisely this sort of chaining, however, that creates the reliable sequencing that make monads so effective at integrating pure functions with IO actions.

do notation review

Given the emphasis on sequencing, the do notation can be especially appealing with the IO monad. Our program

putStrLn "Write your string: " >> fmap shout getLine >>= putStrLn

could be written as:

do putStrLn "Write your string: "
   string <- getLine
   putStrLn (shout string)

The universe as part of our program

One way of viewing the IO monad is to consider IO a as a computation which provides a value of type a while changing the state of the world by doing input and output. Obviously, you cannot literally set the state of the world; it is hidden from you, as the IO functor is abstract (that is, you cannot dig into it to see the underlying values, a situation unlike what we have seen in the case of Maybe).

Understand that this idea of the universe as an object affected and affecting Haskell values through IO is only a metaphor; a loose interpretation at best. The more mundane fact is that IO simply brings some very base-level operations into the Haskell language.^[1] Remember that Haskell is an abstraction, and that Haskell programs must be compiled to machine code in order to actually run. The actual workings of IO happen at a lower level of abstraction, and are wired into the very definition of the Haskell language.^[2]

Pure and impure

The adjectives "pure" and "impure" often crop up while talking about I/O in Haskell. To clarify what they mean, we will revisit the discussion about referential transparency from the Prologue chapter. Consider the following snippet:

speakTo :: (String -> String) -> IO String
speakTo fSentence = fmap fSentence getLine

-- Usage example.
sayHello :: IO String
sayHello = speakTo (\name -> "Hello, " ++ name ++ "!")

In most other programming languages, which do not have separate types for I/O actions, speakTo would have a type akin to:

speakTo :: (String -> String) -> String

With such a type, however, speakTo would not be a function at all! Functions produce the same results when given the same arguments; the String delivered by speakTo, however, also depends on whatever is typed at the terminal prompt. In Haskell, we avoid that pitfall by returning an IO String, which is not a String but a promise that some String will be delivered by carrying out certain instructions involving I/O (in this case, the I/O consists of getting a line of input from the terminal). Though the String can be different each time speakTo is evaluated, the I/O instructions are always the same.

When we say Haskell is a purely functional language, we mean that all of its functions are really functions – or, in other words, that Haskell expressions are always referentially transparent. If speakTo had the problematic type we mentioned above, referential transparency would be violated: sayHello would be a String, and yet replacing it by any specific string would break the program.

In spite of Haskell being purely functional, IO actions can be said to be impure because their impacts on the outside world are side effects (as opposed to the regular effects that are entirely contained within Haskell). Programming languages that lack purity may have side-effects in many other places connected with various calculations. Purely functional languages, however, assure that even expressions with impure values are referentially transparent. That means we can talk about, reason about and handle impurity in a purely functional way, using purely functional machinery such as functors and monads. While IO actions are impure, all of the Haskell functions that manipulate them remain pure.

Functional purity, coupled with the fact that I/O shows up in types, benefits Haskell programmers in various ways. The guarantees about referential transparency increase a lot the potential for compiler optimizations. IO values being distinguishable through types alone make it possible to immediately tell where we are engaging with side effects or opaque values. As IO itself is just another functor, we maintain to the fullest extent the predictability and ease of reasoning associated with pure functions.

Functional and imperative

When we introduced monads, we said that a monadic expression can be interpreted as a statement of an imperative language. That interpretation is immediately compelling for IO, as the language around IO actions looks a lot like a conventional imperative language. It must be clear, however, that we are talking about an interpretation. We are not saying that monads or do notation turn Haskell into an imperative language. The point is merely that you can view and understand monadic code in terms of imperative statements. The semantics may be imperative, but the implementation of monads and (>>=) is still purely functional. To make this distinction clear, let's look at a little illustration:

int x;
scanf("%d", &x);
printf("%d\n", x);

This is a snippet of C, a typical imperative language. In it, we declare a variable x, read its value from user input with scanf and then print it with printf. We can, within an IO do block, write a Haskell snippet that performs the same function and looks quite similar:

x <- readLn
print x

Semantically, the snippets are nearly equivalent.^[3] In the C code, however, the statements directly correspond to instructions to be carried out by the program. The Haskell snippet, on the other hand, is desugared to:

readLn >>= \x -> print x

The desugared version has no statements, only functions being applied. We tell the program the order of the operations indirectly as a simple consequence of data dependencies: when we chain monadic computations with (>>=), we get the later results by applying functions to the results of the earlier ones. It just happens that, for instance, evaluating print x leads to a string being printed in the terminal.

When using monads, Haskell allows us to write code with imperative semantics while keeping the advantages of functional programming.

I/O in the libraries

So far the only I/O primitives we have used were putStrLn and getLine and small variations thereof. The standard libraries, however, offer many other useful functions and actions involving IO. We present some of the most important ones in the IO chapter in Haskell in Practice, including the basic functionality needed for reading from and writing to files.

Monadic control structures

Given that monads allow us to express sequential execution of actions in a wholly general way, could we use them to implement common iterative patterns, such as loops? In this section, we will present a few of the functions from the standard libraries which allow us to do precisely that. While the examples are presented here applied to IO, keep in mind that the following ideas apply to every monad.

Remember, there is nothing magical about monadic values; we can manipulate them just like any other values in Haskell. Knowing that, we might think to try the following function to get five lines of user input:

fiveGetLines = replicate 5 getLine

That won't do, however (try it in GHCi!). The problem is that replicate produces, in this case, a list of actions, while we want an action which returns a list (that is, IO [String] rather than [IO String]). What we need is a fold to run down the list of actions, executing them and combining the results into a single list. As it happens, there is a Prelude function which does that: sequence.

sequence :: (Monad m) => [m a] -> m [a]

And so, we get the desired action with:

fiveGetLines = sequence $ replicate 5 getLine

replicate and sequence form an appealing combination, so Control.Monad offers a replicateM function for repeating an action an arbitrary number of times. Control.Monad provides a number of other convenience functions in the same spirit - monadic zips, folds, and so forth.

fiveGetLinesAlt = replicateM 5 getLine

A particularly important combination is map and sequence. Together, they allow us to make actions from a list of values, run them sequentially, and collect the results. mapM, a Prelude function, captures this pattern:

mapM :: (Monad m) => (a -> m b) -> [a] -> m [b]

We also have variants of the above functions with a trailing underscore in the name, such as sequence_, mapM_ and replicateM_. These discard any final values and so are appropriate when you are only interested in performing actions. Compared with their underscore-less counterparts, these functions are like the distinction between (>>) and (>>=). mapM_ for instance has the following type:

mapM_ :: (Monad m) => (a -> m b) -> [a] -> m ()

Finally, it is worth mentioning that Control.Monad also provides forM and forM_, which are flipped versions of mapM and mapM_. forM_ happens to be the idiomatic Haskell counterpart to the imperative for-each loop; and the type signature suggests that neatly:

forM_ :: (Monad m) => [a] -> (a -> m b) -> m ()

Exercises
Using the monadic functions we have just introduced, write a function which prints an arbitrary list of values. Generalize the bunny invasion example in the list monad chapter for an arbitrary number of generations. What is the expected behavior of `sequence` for the `Maybe` monad?

Notes

↑ The technical term is "primitive", as in primitive operations.
↑ The same can be said about all higher-level programming languages, of course. Incidentally, Haskell's IO operations can actually be extended via the Foreign Function Interface (FFI) which can make calls to C libraries. As C can use inline assembly code, Haskell can indirectly engage with anything a computer can do. Still, Haskell functions manipulate such outside operations only indirectly as values in IO functors.
↑ One difference is that x is a mutable variable in C, and so it is possible to declare it in one statement and set its value in the next; Haskell never allows such mutability. If we wanted to imitate the C code even more closely, we could have used an IORef, which is a cell that contains a value which can be destructively updated. For obvious reasons, IORefs can only be used within the IO monad.