Yet Another Haskell Tutorial/Language advanced
Contents
Sections and Infix Operators[edit]
We've already seen how to double the values of elements in a list using map
:
Example:
Prelude> map (\x > x*2) [1,2,3,4] [2,4,6,8]
However, there is a more concise way to write this:
Example:
Prelude> map (*2) [1,2,3,4] [2,4,6,8]
This type of thing can be done for any infix function:
Example:
Prelude> map (+5) [1,2,3,4] [6,7,8,9] Prelude> map (/2) [1,2,3,4] [0.5,1.0,1.5,2.0] Prelude> map (2/) [1,2,3,4] [2.0,1.0,0.666667,0.5]
You might be tempted to try to subtract values from elements in a list by mapping 2
across a list. This won't work, though, because while the +
in +2
is parsed as the standard plus operator (as there is no ambiguity), the 
in 2
is interpreted as the unary minus, not the binary minus. Thus 2
here is the number , not the function .
In general, these are called sections. For binary infix operators (like +
), we can cause the function to become prefix by enclosing it in paretheses. For example:
Example:
Prelude> (+) 5 3 8 Prelude> () 5 3 2
Additionally, we can provide either of its argument to make a section. For example:
Example:
Prelude> (+5) 3 8 Prelude> (/3) 6 2.0 Prelude> (3/) 6 0.5
Noninfix functions can be made infix by enclosing them in backquotes ("\`"). For example:
Example:
Prelude> (+2) `map` [1..10] [3,4,5,6,7,8,9,10,11,12] Prelude> (`map` [1..10]) (+2) [3,4,5,6,7,8,9,10,11,12]
Local Declarations[edit]
Recall back from the section on Functions, there are many computations which require using the result of the same computation in multiple places in a function. There, we considered the function for computing the roots of a quadratic polynomial:
roots a b c = ((b + sqrt(b*b  4*a*c)) / (2*a), (b  sqrt(b*b  4*a*c)) / (2*a))
In addition to the let bindings introduced there, we can do this using a where clause. where
clauses come immediately after function definitions and introduce a new level of layout (see the section on Layout). We write this as:
roots a b c = ((b + det) / (2*a), (b  det) / (2*a)) where det = sqrt(b*b4*a*c)
Any values defined in a where clause shadow any other values with the same name. For instance, if we had the following code block:
det = "Hello World" roots a b c = ((b + det) / (2*a), (b  det) / (2*a)) where det = sqrt(b*b4*a*c) f _ = det
The value of roots
doesn't notice the toplevel declaration of det
, since it is shadowed by the local definition (the fact that the types don't match doesn't matter either). Furthermore, since f
cannot "see inside" of roots
, the only thing it knows about det
is what is available at the top level, which is the string "Hello World."
We could also pull out the 2*a
computation and get the following code:
roots a b c = ((b + det) / (a2), (b  det) / (a2)) where det = sqrt(b*b4*a*c) a2 = 2*a
Subexpressions in where clauses must come after function definitions. Sometimes it is more convenient to put the local definitions before the actual expression of the function. This can be done by using let/in clauses. We have already seen let clauses; where clauses are virtually identical to their let clause cousins except for their placement. The same roots
function can be written using let as:
roots a b c = let det = sqrt (b*b  4*a*c) a2 = 2*a in ((b + det) / a2, (b  det) / a2)
Using a where clause, it looks like:
roots a b c = ((b + det) / a2, (b  det) / a2) where det = sqrt (b*b  4*a*c) a2 = 2*a
These two types of clauses can be mixed (i.e., you can write a function which has both a let cause and a where clause). This is strongly advised against, as it tends to make code difficult to read. However, if you choose to do it, values in the let clause shadow those in the where clause. So if you define the function:
f x = let y = x+1 in y where y = x+2
The value of f 5
is 6
, not 7
. Of course, I plead with you to never ever write code that looks like this. No one should have to remember this rule and by shadowing wheredefined values in a let clause only makes your code difficult to understand.
In general, whether you should use let clauses or where clauses is largely a matter of personal preference. Usually, the names you give to the subexpressions should be sufficiently expressive that without reading their definitions any reader of your code should be able to figure out what they do. In this case, where clauses are probably more desirable because they allow the reader to see immediately what a function does. However, in real life, values are often given cryptic names.
Partial Application[edit]
Partial application is when you take a function which takes arguments and you supply it with of them. When discussing Sections, we saw a form of "partial application" in which functions like +
were partially applied. For instance, in the expression map (+1) [1,2,3]
, the section (+1)
is a partial application of +
. This is because +
really takes two arguments, but we've only given it one.
Partial application is very common in function definitions and sometimes goes by the name "η (eta) reduction". For instance, suppose we are writing a function lcaseString
which converts a whole string into lower case. We could write this as:
lcaseString s = map toLower s
Here, there is no partial application (though you could argue that applying no arguments to toLower
could be considered partial application). However, we notice that the application of s
occurs at the end of both lcaseString
and of map toLower
. In fact, we can remove it by performing eta reduction, to get:
lcaseString = map toLower
Now, we have a partial application of map
: it expects a function and a list, but we've only given it the function.
This all is related to the type of map
, which is (a > b) > ([a] > [b])
, when all parentheses are included. In our case, toLower
is of type Char > Char
. Thus, if we supply this function to map
, we get a function of type [Char] > [Char]
, as desired.
Now, consider the task of converting a string to lowercase and removing all non letter characters. We might write this as:
lcaseLetters s = map toLower (filter isAlpha s)
But note that we can actually write this in terms of function composition:
lcaseLetters s = (map toLower . filter isAlpha) s
And again, we're left with an eta reducible function:
lcaseLetters = map toLower . filter isAlpha
Writing functions in this style is very common among advanced Haskell users. In fact it has a name: pointfree programming (not to be confused with pointless programming). It is called point free because in the original definition of lcaseLetters
, we can think of the value s
as a point on which the function is operating. By removing the point from the function definition, we have a pointfree function.
A function similar to (.)
is ($)
. Whereas (.)
is function composition, ($)
is function application. The definition of ($)
from the Prelude is very simple:
f $ x = f x
However, this function is given very low fixity, which means that it can be used to replace parentheses. For instance, we might write a function:
foo x y = bar y (baz (fluff (ork x)))
However, using the function application function, we can rewrite this as:
foo x y = bar y $ baz $ fluff $ ork x
This moderately resembles the function composition syntax. The ($)
function is also useful when combined with other infix functions. For instance, we cannot write:
Example:
Prelude> putStrLn "5+3=" ++ show (5+3)
because this is interpreted as (putStrLn "5+3=") ++ (show (5+3))
, which makes no sense. However, we can fix this by writing instead:
Example:
Prelude> putStrLn $ "5+3=" ++ show (5+3)
Which works fine.
Consider now the task of extracting from a list of tuples all the ones whose first component is greater than zero. One way to write this would be:
fstGt0 l = filter (\ (a,b) > a>0) l
We can first apply eta reduction to the whole function, yielding:
fstGt0 = filter (\ (a,b) > a>0)
Now, we can rewrite the lambda function to use the fst
function instead of the pattern matching:
fstGt0 = filter (\x > fst x > 0)
Now, we can use function composition between fst
and >
to get:
fstGt0 = filter (\x > ((>0) . fst) x)
And finally we can eta reduce:
fstGt0 = filter ((>0).fst)
This definition is simultaneously shorter and easier to understand than the original. We can clearly see exactly what it is doing: we're filtering a list by checking whether something is greater than zero. What are we checking? The fst
element.
While converting to point free style often results in clearer code, this is of course not always the case. For instance, converting the following map to point free style yields something nearly uninterpretable:
foo = map (\x > sqrt (3+4*(x^2))) foo = map (sqrt . (3+) . (4*) . (^2))
There are a handful of combinators defined in the Prelude which are useful for point free programming:
uncurry
takes a function of typea > b > c
and converts it into a function of type(a,b) > c
. This is useful, for example, when mapping across a list of pairs:
Example:
Prelude> map (uncurry (*)) [(1,2),(3,4),(5,6)] [2,12,30]
curry
is the opposite ofuncurry
and takes a function of type(a,b) > c
and produces a function of typea > b > c
.
flip
reverse the order of the first two arguments to a function. That is, it takes a function of typea > b > c
and produces a function of typeb > a > c
. For instance, we can sort a list in reverse order by usingflip compare
:
Example:
Prelude> List.sortBy compare [5,1,8,3] [1,3,5,8] Prelude> List.sortBy (flip compare) [5,1,8,3] [8,5,3,1]
This is the same as saying:
Example:
Prelude> List.sortBy (\a b > compare b a) [5,1,8,3] [8,5,3,1]
only shorter.
Of course, not all functions can be written in point free style. For instance:
square x = x*x
Cannot be written in point free style, without some other combinators. For instance, if we can define other functions, we can write:
pair x = (x,x) square = uncurry (*) . pair
But in this case, this is not terribly useful.
Exercises 

Convert the following functions into pointfree style, if possible. func1 x l = map (\y > y*x) l func2 f g l = filter f (map g l) func3 f l = l ++ map f l func4 l = map (\y > y+2) (filter (\z > z `elem` [1..10]) (5:l)) func5 f l = foldr (\x y > f (y,x)) 0 l 
Pattern Matching[edit]
Pattern matching is one of the most powerful features of Haskell (and most functional programming languages). It is most commonly used in conjunction with case expressions, which we have already seen in the section on Functions. Let's return to our Color
example from the section on Datatypes. I'll repeat the definition we already had for the datatype:
data Color = Red  Orange  Yellow  Green  Blue  Purple  White  Black  Custom Int Int Int  R G B components deriving (Show,Eq)
We then want to write a function that will convert between something of type Color
and a triple of Int
s, which correspond to the RGB values, respectively. Specifically, if we see a Color
which is Red
, we want to return (255,0,0)
, since this is the RGB value for red. So we write that (remember that piecewise function definitions are just case statements):
colorToRGB Red = (255,0,0)
If we see a Color
which is Orange
, we want to return (255,128,0)
; and if we see Yellow
, we want to return (255,255,0)
, and so on. Finally, if we see a custom color, which is comprised of three components, we want to make a triple out of these, so we write:
colorToRGB Orange = (255,128,0) colorToRGB Yellow = (255,255,0) colorToRGB Green = (0,255,0) colorToRGB Blue = (0,0,255) colorToRGB Purple = (255,0,255) colorToRGB White = (255,255,255) colorToRGB Black = (0,0,0) colorToRGB (Custom r g b) = (r,g,b)
Then, in our interpreter, if we type:
Example:
Color> colorToRGB Yellow (255,255,0)
What is happening is this: we create a value, call it , which has value Yellow
. We then apply this to colorToRGB
. We check to see if we can "match" against Red
. This match fails because according to the definition of Eq
{Color}, Red
is not equal to Yellow
. We continue down the definitions of colorToRGB
and try to match Yellow
against Orange
. This fails, too. We the try to match Yellow
against Yellow
, which succeeds, so we use this function definition, which simply returns the value (255,255,0)
, as expected.
Suppose instead, we used a custom color:
Example:
Color> colorToRGB (Custom 50 200 100) (50,200,100)
We apply the same matching process, failing on all values from Red
to Black
. We then get to try to match Custom 50 200 100
against Custom r g b
. We can see that the Custom
part matches, so then we go see if the subelements match. In the matching, the variables r
, g
and b
are essentially wild cards, so there is no trouble matching r
with 50, g
with 200 and b
with 100. As a "sideeffect" of this matching, r
gets the value 50, g
gets the value 200 and b
gets the value 100. So the entire match succeeded and we look at the definition of this part of the function and bundle up the triple using the matched values of r
, g
and b
.
We can also write a function to check to see if a Color
is a custom color or not:
isCustomColor (Custom _ _ _) = True isCustomColor _ = False
When we apply a value to isCustomColor
it tries to match that value against Custom _ _ _
. This match will succeed if the value is Custom x y z
for any x
, y
and z
. The _
(underscore) character is a "wildcard" and will match anything, but will not do the binding that would happen if you put a variable name there. If this match succeeds, the function returns True
; however, if this match fails, it goes on to the next line, which will match anything and then return False
.
For some reason we might want to define a function which tells us whether a given color is "bright" or not, where my definition of "bright" is that one of its RGB components is equal to 255 (admittedly an arbitrary definition, but it's simply an example). We could define this function as:
isBright = isBright' . colorToRGB where isBright' (255,_,_) = True isBright' (_,255,_) = True isBright' (_,_,255) = True isBright' _ = False
Let's dwell on this definition for a second. The isBright
function is the composition of our previously defined function colorToRGB
and a helper function isBright'
, which tells us if a given RGB value is bright or not. We could replace the first line here with isBright c = isBright' (colorToRGB c)
but there is no need to explicitly write the parameter here, so we don't. Again, this function composition style of programming takes some getting used to, so I will try to use it frequently in this tutorial.
The isBright'
helper function takes the RGB triple produced by colorToRGB
. It first tries to match it against (255,_,_)
which succeeds if the value has 255 in its first position. If this match succeeds, isBright'
returns True
and so does isBright
. The second and third line of definition check for 255 in the second and third position in the triple, respectively. The fourth line, the fallthrough, matches everything else and reports it as not bright.
We might want to also write a function to convert between RGB triples and Color
s. We could simply stick everything in a Custom
constructor, but this would defeat the purpose; we want to use the Custom
slot only for values which don't match the predefined colors. However, we don't want to allow the user to construct custom colors like (600,40,99) since these are invalid RGB values. We could throw an error if such a value is given, but this can be difficult to deal with. Instead, we use the Maybe
datatype. This is defined (in the Prelude) as:
data Maybe a = Nothing  Just a
The way we use this is as follows: our rgbToColor
function returns a value of type Maybe Color
. If the RGB value passed to our function is invalid, we return Nothing
, which corresponds to a failure. If, on the other hand, the RGB value is valid, we create the appropriate Color
value and return Just
that. The code to do this is:
rgbToColor 255 0 0 = Just Red rgbToColor 255 128 0 = Just Orange rgbToColor 255 255 0 = Just Yellow rgbToColor 0 255 0 = Just Green rgbToColor 0 0 255 = Just Blue rgbToColor 255 0 255 = Just Purple rgbToColor 255 255 255 = Just White rgbToColor 0 0 0 = Just Black rgbToColor r g b = if 0 <= r && r <= 255 && 0 <= g && g <= 255 && 0 <= b && b <= 255 then Just (Custom r g b) else Nothing  invalid RGB value
The first eight lines match the RGB arguments against the predefined values and, if they match, rgbToColor
returns Just
the appropriate color. If none of these matches, the last definition of rgbToColor
matches the first argument against r
, the second against g
and the third against b
(which causes the sideeffect of binding these values). It then checks to see if these values are valid (each is greater than or equal to zero and less than or equal to 255). If so, it returns Just (Custom r g b)
; if not, it returns Nothing
corresponding to an invalid color.
Using this, we can write a function that checks to see if a right RGB value is valid:
rgbIsValid r g b = rgbIsValid' (rgbToColor r g b) where rgbIsValid' (Just _) = True rgbIsValid' _ = False
Here, we compose the helper function rgbIsValid'
with our function rgbToColor
. The helper function checks to see if the value returned by rgbToColor
is Just
anything (the wildcard). If so, it returns True
. If not, it matches anything and returns False
.
Pattern matching isn't magic, though. You can only match against datatypes; you cannot match against functions. For instance, the following is invalid:
f x = x + 1 g (f x) = x
Even though the intended meaning of g
is clear (i.e., g x = x  1
), the compiler doesn't know in general that f
has an inverse function, so it can't perform matches like this.
Guards[edit]
Guards can be thought of as an extension to the pattern matching facility. They enable you to allow piecewise function definitions to be taken according to arbitrary boolean expressions. Guards appear after all arguments to a function but before the equals sign, and are begun with a vertical bar. We could use guards to write a simple function which returns a string telling you the result of comparing two elements:
comparison x y  x < y = "The first is less"  x > y = "The second is less"  otherwise = "They are equal"
You can read the vertical bar as "such that." So we say that the value of comparison x y
"such that" x is less than y is "The first is less." The value such that x is greater than y is "The second is less" and the value otherwise is "They are equal". The keyword otherwise
is simply defined to be equal to True
and thus matches anything that falls through that far. So, we can see that this works:
Example:
Guards> comparison 5 10 "The first is less" Guards> comparison 10 5 "The second is less" Guards> comparison 7 7 "They are equal"
Guards are applied in conjunction with pattern matching. When a pattern matches, all of its guards are tried, consecutively, until one matches. If none match, then pattern matching continues with the next pattern.
One nicety about guards is that where clauses are common to all guards. So another possible definition for our isBright
function from the previous section would be:
isBright2 c  r == 255 = True  g == 255 = True  b == 255 = True  otherwise = False where (r,g,b) = colorToRGB c
The function is equivalent to the previous version, but performs its calculation slightly differently. It takes a color, c
, and applies colorToRGB
to it, yielding an RGB triple which is matched (using pattern matching!) against (r,g,b)
. This match succeeds and the values r
, g
and b
are bound to their respective values. The first guard checks to see if r
is 255 and, if so, returns true. The second and third guard check g
and b
against 255, respectively and return true if they match. The last guard fires as a last resort and returns False
.
Instance Declarations[edit]
In order to declare a type to be an instance of a class, you need to provide an instance declaration for it. Most classes provide what's called a "minimal complete definition." This means the functions which must be implemented for this class in order for its definition to be satisfied. Once you've written these functions for your type, you can declare it an instance of the class.
The Eq Class[edit]
The Eq
class has two members (i.e., two functions):
(==) :: Eq a => a > a > Bool (/=) :: Eq a => a > a > Bool
The first of these type signatures reads that the function ==
is a function which takes two a
s which are members of Eq
and produces a Bool
. The type signature of /=
(not equal) is identical. A minimal complete definition for the Eq
class requires that either one of these functions be defined (if you define ==
, then /=
is defined automatically by negating the result of ==
, and vice versa). These declarations must be provided inside the instance declaration.
This is best demonstrated by example. Suppose we have our color example, repeated here for convenience:
data Color = Red  Orange  Yellow  Green  Blue  Purple  White  Black  Custom Int Int Int  R G B components
We can define Color
to be an instance of Eq
by the following declaration:
instance Eq Color where Red == Red = True Orange == Orange = True Yellow == Yellow = True Green == Green = True Blue == Blue = True Purple == Purple = True White == White = True Black == Black = True (Custom r g b) == (Custom r' g' b') = r == r' && g == g' && b == b' _ == _ = False
The first line here begins with the keyword instance telling the compiler that we're making an instance declaration. It then specifies the class, Eq
, and the type, Color
which is going to be an instance of this class. Following that, there's the where keyword. Finally there's the method declaration.
The first eight lines of the method declaration are basically identical. The first one, for instance, says that the value of the expression Red == Red
is equal to True
. Lines two through eight are identical. The declaration for custom colors is a bit different. We pattern match Custom
on both sides of ==
. On the left hand side, we bind r
, g
and b
to the components, respectively. On the right hand side, we bind r'
, g'
and b'
to the components. We then say that these two custom colors are equal precisely when r == r'
, g == g'
and b == b'
are all equal. The fallthrough says that any pair we haven't previously declared as equal are unequal.
The Show Class[edit]
The Show
class is used to display arbitrary values as strings. This class has three methods:
show :: Show a => a > String showsPrec :: Show a => Int > a > String > String showList :: Show a => [a] > String > String
A minimal complete definition is either show
or showsPrec
(we will talk about showsPrec
later  it's in there for efficiency reasons). We can define our Color
datatype to be an instance of Show
with the following instance declaration:
instance Show Color where show Red = "Red" show Orange = "Orange" show Yellow = "Yellow" show Green = "Green" show Blue = "Blue" show Purple = "Purple" show White = "White" show Black = "Black" show (Custom r g b) = "Custom " ++ show r ++ " " ++ show g ++ " " ++ show b
This declaration specifies exactly how to convert values of type Color
to String
s. Again, the first eight lines are identical and simply take a Color
and produce a string. The last line for handling custom colors matches out the RGB components and creates a string by concatenating the result of show
ing the components individually (with spaces in between and "Custom" at the beginning).
Other Important Classes[edit]
There are a few other important classes which I will mention briefly because either they are commonly used or because we will be using them shortly. I won't provide example instance declarations; how you can do this should be clear by now.
The Ord Class[edit]
The ordering class, the functions are:
compare :: Ord a => a > a > Ordering (<=) :: Ord a => a > a > Bool (>) :: Ord a => a > a > Bool (>=) :: Ord a => a > a > Bool (<) :: Ord a => a > a > Bool min :: Ord a => a > a > a max :: Ord a => a > a > a
Almost any of the functions alone is a minimal complete definition; it is recommended that you implement compare
if you implement only one, though. This function returns a value of type Ordering
which is defined as:
data Ordering = LT  EQ  GT
So, for instance, we get:
Example:
Prelude> compare 5 7 LT Prelude> compare 6 6 EQ Prelude> compare 7 5 GT
In order to declare a type to be an instance of Ord
you must already have declared it an instance of Eq
(in other words, Ord
is a subclass of Eq
 more about this in the section on Classes).
The Enum Class[edit]
The Enum
class is for enumerated types; that is, for types where each element has a successor and a predecessor. Its methods are:
pred :: Enum a => a > a succ :: Enum a => a > a toEnum :: Enum a => Int > a fromEnum :: Enum a => a > Int enumFrom :: Enum a => a > [a] enumFromThen :: Enum a => a > a > [a] enumFromTo :: Enum a => a > a > [a] enumFromThenTo :: Enum a => a > a > a > [a]
The minimal complete definition contains both toEnum
and fromEnum
, which converts from and to Int
s. The pred
and succ
functions give the predecessor and successor, respectively. The enum
functions enumerate lists of elements. For instance, enumFrom x
lists all elements after x
; enumFromThen x step
lists all elements starting at x
in steps of size step
. The To
functions end the enumeration at the given element.
The Num Class[edit]
The Num
class provides the standard arithmetic operations:
() :: Num a => a > a > a (*) :: Num a => a > a > a (+) :: Num a => a > a > a negate :: Num a => a > a signum :: Num a => a > a abs :: Num a => a > a fromInteger :: Num a => Integer > a
All of these are obvious except for perhaps negate
which is the unary minus. That is, negate x
means .
The Read Class[edit]
The Read
class is the opposite of the Show
class. It is a way to take a string and read in from it a value of arbitrary type. The methods for Read
are:
readsPrec :: Read a => Int > String > [(a, String)] readList :: String > [([a], String)]
The minimal complete definition is readsPrec
. The most important function related to this is read
, which uses readsPrec
as:
read s = fst (head (readsPrec 0 s))
This will fail if parsing the string fails. You could define a maybeRead
function as:
maybeRead s = case readsPrec 0 s of [(a,_)] > Just a _ > Nothing
How to write and use readsPrec
directly will be discussed further in the examples.
Class Contexts[edit]
Suppose we are defining the Maybe
datatype from scratch. The definition would be something like:
data Maybe a = Nothing  Just a
Now, when we go to write the instance declarations, for, say, Eq
, we need to know that a
is an instance of Eq
otherwise we can't write a declaration. We express this as:
instance Eq a => Eq (Maybe a) where Nothing == Nothing = True (Just x) == (Just x') = x == x'
This first line can be read "That a
is an instance of Eq
implies (=>
) that Maybe a
is an instance of Eq
."
Deriving Classes[edit]
Writing obvious Eq
, Ord
, Read
and Show
classes like these is tedious and should be automated. Luckily for us, it is. If you write a datatype that's "simple enough" (almost any datatype you'll write unless you start writing fixed point types), the compiler can automatically derive some of the most basic classes. To do this, you simply add a deriving clause to after the datatype declaration, as in:
data Color = Red  ...  Custom Int Int Int  R G B components deriving (Eq, Ord, Show, Read)
This will automatically create instances of the Color
datatype of the named classes. Similarly, the declaration:
data Maybe a = Nothing  Just a deriving (Eq, Ord, Show, Read)
derives these classes just when a
is appropriate.
All in all, you are allowed to derive instances of Eq
, Ord
, Enum
, Bounded
, Show
and Read
. There is considerable work in the area of "polytypic programming" or "generic programming" which, among other things, would allow for instance declarations for any class to be derived. This is much beyond the scope of this tutorial; instead, I refer you to the literature.
Datatypes Revisited[edit]
I know by this point you're probably terribly tired of hearing about datatypes. They are, however, incredibly important, otherwise I wouldn't devote so much time to them. Datatypes offer a sort of notational convenience if you have, for instance, a datatype that holds many many values. These are called named fields.
Named Fields[edit]
Consider a datatype whose purpose is to hold configuration settings. Usually when you extract members from this type, you really only care about one or possibly two of the many settings. Moreover, if many of the settings have the same type, you might often find yourself wondering "wait, was this the fourth or fifth element?" One thing you could do would be to write accessor functions. Consider the following madeup configuration type for a terminal program:
data Configuration = Configuration String  user name String  local host String  remote host Bool  is guest? Bool  is super user? String  current directory String  home directory Integer  time connected deriving (Eq, Show)
You could then write accessor functions, like (I've only listed a few):
getUserName (Configuration un _ _ _ _ _ _ _) = un getLocalHost (Configuration _ lh _ _ _ _ _ _) = lh getRemoteHost (Configuration _ _ rh _ _ _ _ _) = rh getIsGuest (Configuration _ _ _ ig _ _ _ _) = ig ...
You could also write update functions to update a single element. Of course, now if you add an element to the configuration, or remove one, all of these functions now have to take a different number of arguments. This is highly annoying and is an easy place for bugs to slip in. However, there's a solution. We simply give names to the fields in the datatype declaration, as follows:
data Configuration = Configuration { username :: String, localhost :: String, remotehost :: String, isguest :: Bool, issuperuser :: Bool, currentdir :: String, homedir :: String, timeconnected :: Integer }
This will automatically generate the following accessor functions for us:
username :: Configuration > String localhost :: Configuration > String ...
Moreover, it gives us very convenient update methods. Here is a short example for a "post working directory" and "change directory" like functions that work on Configuration
s:
changeDir :: Configuration > String > Configuration changeDir cfg newDir =  make sure the directory exists if directoryExists newDir then  change our current directory cfg{currentdir = newDir} else error "directory does not exist" postWorkingDir :: Configuration > String  retrieve our current directory postWorkingDir cfg = currentdir cfg
So, in general, to update the field x
in a datatype y
to z
, you write y{x=z}
. You can change more than one; each should be separated by commas, for instance, y{x=z, a=b, c=d}
.
You can of course continue to pattern match against Configuration
s as you did before. The named fields are simply syntactic sugar; you can still write something like:
getUserName (Configuration un _ _ _ _ _ _ _) = un
But there is little reason to. Finally, you can pattern match against named fields as in:
getHostData (Configuration {localhost=lh,remotehost=rh}) = (lh,rh)
This matches the variable lh
against the localhost
field on the Configuration
and the variable rh
against the remotehost
field on the Configuration
. These matches of course succeed. You could also constrain the matches by putting values instead of variable names in these positions, as you would for standard datatypes.
You can create values of Configuration
in the old way as shown in the first definition below, or in the namedfield's type, as shown in the second definition below:
initCFG = Configuration "nobody" "nowhere" "nowhere" False False "/" "/" 0 initCFG' = Configuration { username="nobody", localhost="nowhere", remotehost="nowhere", isguest=False, issuperuser=False, currentdir="/", homedir="/", timeconnected=0 }
Though the second is probably much more understandable unless you litter your code with comments.
More Lists[edit]
to do: put something here
Standard List Functions[edit]
Recall that the definition of the builtin Haskell list datatype is equivalent to:
data List a = Nil  Cons a (List a)
With the exception that Nil
is called []
and Cons x xs
is called x:xs
. This is simply to make pattern matching easier and code smaller. Let's investigate how some of the standard list functions may be written. Consider map
. A definition is given below:
map _ [] = [] map f (x:xs) = f x : map f xs
Here, the first line says that when you map
across an empty list, no matter what the function is, you get an empty list back. The second line says that when you map
across a list with x
as the head and xs
as the tail, the result is f
applied to x
consed onto the result of mapping f
on xs
.
The filter
can be defined similarly:
filter _ [] = [] filter p (x:xs)  p x = x : filter p xs  otherwise = filter p xs
How this works should be clear. For an empty list, we return an empty list. For a non empty list, we return the filter of the tail, perhaps with the head on the front, depending on whether it satisfies the predicate p
or not.
We can define foldr
as:
foldr _ z [] = z foldr f z (x:xs) = f x (foldr f z xs)
Here, the best interpretation is that we are replacing the empty list ([]
) with a particular value and the list constructor (:
) with some function. On the first line, we can see the replacement of []
for z
. Using backquotes to make f
infix, we can write the second line as:
foldr f z (x:xs) = x `f` (foldr f z xs)
From this, we can directly see how :
is being replaced by f
.
Finally, foldl
:
foldl _ z [] = z foldl f z (x:xs) = foldl f (f z x) xs
This is slightly more complicated. Remember, z
can be thought of as the current state. So if we're folding across a list which is empty, we simply return the current state. On the other hand, if the list is not empty, it's of the form x:xs
. In this case, we get a new state by appling f
to the current state z
and the current list element x
and then recursively call foldl
on xs
with this new state.
There is another class of functions: the zip
and unzip
functions, which respectively take multiple lists and make one or take one lists and split them apart. For instance, zip
does the following:
Example:
Prelude> zip "hello" [1,2,3,4,5] [('h',1),('e',2),('l',3),('l',4),('o',5)]
Basically, it pairs the first elements of both lists and makes that the first element of the new list. It then pairs the second elements of both lists and makes that the second element, etc. What if the lists have unequal length? It simply stops when the shorter one stops. A reasonable definition for zip
is:
zip [] _ = [] zip _ [] = [] zip (x:xs) (y:ys) = (x,y) : zip xs ys
The unzip
function does the opposite. It takes a zipped list and returns the two "original" lists:
Example:
Prelude> unzip [('f',1),('o',2),('o',3)] ("foo",[1,2,3])
There are a whole slew of zip
and unzip
functions, named zip3
, unzip3
, zip4
, unzip4
and so on; the ...3
functions use triples instead of pairs; the ...4
functions use 4tuples, etc.
Finally, the function take
takes an integer and a list and returns the first elements off the list. Correspondingly, drop
takes an integer and a list and returns the result of throwing away the first elements off the list. Neither of these functions produces an error; if is too large, they both will just return shorter lists.
List Comprehensions[edit]
There is some syntactic sugar for dealing with lists whose elements are members of the Enum
class (see the section on Instances), such as Int
or Char
. If we want to create a list of all the elements from to , we can simply write:
Example:
Prelude> [1..10] [1,2,3,4,5,6,7,8,9,10]
We can also introduce an amount to step by:
Example:
Prelude> [1,3..10] [1,3,5,7,9] Prelude> [1,4..10] [1,4,7,10]
These expressions are short hand for enumFromTo
and enumFromThenTo
, respectively. Of course, you don't need to specify an upper bound. Try the following (but be ready to hit Control+C to stop the computation!):
Example:
Prelude> [1..] [1,2,3,4,5,6,7,8,9,10,11,12{Interrupted!}
Probably yours printed a few thousand more elements than this. As we said before, Haskell is lazy. That means that a list of all numbers from 1 on is perfectly well formed and that's exactly what this list is. Of course, if you attempt to print the list (which we're implicitly doing by typing it in the interpreter), it won't halt. But if we only evaluate an initial segment of this list, we're fine:
Example:
Prelude> take 3 [1..] [1,2,3] Prelude> take 3 (drop 5 [1..]) [6,7,8]
This comes in useful if, say, we want to assign an ID to each element in a list. Without laziness we'd have to write something like this:
assignID :: [a] > [(a,Int)] assignID l = zip l [1..length l]
Which means that the list will be traversed twice. However, because of laziness, we can simply write:
assignID l = zip l [1..]
And we'll get exactly what we want. We can see that this works:
Example:
Prelude> assignID "hello" [('h',1),('e',2),('l',3),('l',4),('o',5)]
Finally, there is some useful syntactic sugar for map
and filter
, based on standard setnotation in mathematics. In math, we would write something like to mean the set of all values of when applied to elements of which satisfy . This is equivalent to the Haskell statement map f (filter p s)
. However, we can also use more mathlike notation and write [f x  x < s, p x]
. While in math the ordering of the statements on the side after the pipe is free, it is not so in Haskell. We could not have put p x
before x < s
otherwise the compiler wouldn't know yet what x
was. We can use this to do simple string processing. Suppose we want to take a string, keep only the uppercase letters and convert those to lowercase. We could do this in either of the following two equivalent ways:
Example:
Prelude> map toLower (filter isUpper "Hello World") "hw" Prelude> [toLower x  x < "Hello World", isUpper x] "hw"
These two are equivalent, and, depending on the exact functions you're using, one might be more readable than the other. There's more you can do here, though. Suppose you want to create a list of pairs, one for each point between (0,0) and (5,7) below the diagonal. Doing this manually with lists and maps would be cumbersome and possibly difficult to read. It couldn't be easier than with list comprehensions:
Example:
Prelude> [(x,y)  x < [1..5], y < [x..7]] [(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),(2,2),(2,3), (2,4),(2,5),(2,6),(2,7),(3,3),(3,4),(3,5),(3,6),(3,7), (4,4),(4,5),(4,6),(4,7),(5,5),(5,6),(5,7)]
If you reverse the order of the x <
and y <
clauses, the order in which the space is traversed will be reversed (of course, in that case, y
could no longer depend on x
and you would need to make x
depend on y
but this is trivial).
Arrays[edit]
Lists are nice for many things. It is easy to add elements to the beginning of them and to manipulate them in various ways that change the length of the list. However, they are bad for random access, having average complexity to access an arbitrary element (if you don't know what means, you can either ignore it or take a quick detour and read the appendix chapter Complexity, a twopage introduction to complexity theory). So, if you're willing to give up fast insertion and deletion because you need random access, you should use arrays instead of lists.
In order to use arrays you must import the Array
module. There are a few methods for creating arrays, the array
function, the listArray
function, and the accumArray
function. The array
function takes a pair which is the bounds of the array, and an association list which specifies the initial values of the array. The listArray
function takes bounds and then simply a list of values. Finally, the accumArray
function takes an accumulation function, an initial value and an association list and accumulates pairs from the list into the array. Here are some examples of arrays being created:
Example:
Prelude> :m Array Prelude Array> array (1,5) [(i,2*i)  i < [1..5]] array (1,5) [(1,2),(2,4),(3,6),(4,8),(5,10)] Prelude Array> listArray (1,5) [3,7,5,1,10] array (1,5) [(1,3),(2,7),(3,5),(4,1),(5,10)] Prelude Array> accumArray (+) 2 (1,5) [(i,i)  i < [1..5]] array (1,5) [(1,3),(2,4),(3,5),(4,6),(5,7)]
When arrays are printed out (via the show function), they are printed with an association list. For instance, in the first example, the association list says that the value of the array at is , the value of the array at is , and so on.
You can extract an element of an array using the !
function, which takes an array and an index, as in:
Example:
Prelude Array> (listArray (1,5) [3,7,5,1,10]) ! 3 5
Moreover, you can update elements in the array using the //
function. This takes an array and an association list and updates the positions specified in the list:
Example:
Prelude Array> (listArray (1,5) [3,7,5,1,10]) // [(2,99),(3,99)] array (1,5) [(1,3),(2,99),(3,99),(4,1),(5,10)]
There are a few other functions which are of interest:
bounds 
returns the bounds of an array 
indices 
returns a list of all indices of the array 
elems 
returns a list of all the values in the array in order 
assocs 
returns an association list for the array 
If we define arr
to be listArray (1,5) [3,7,5,1,10]
, the result of these functions applied to arr
are:
Example:
Prelude Array> bounds arr (1,5) Prelude Array> indices arr [1,2,3,4,5] Prelude Array> elems arr [3,7,5,1,10] Prelude Array> assocs arr [(1,3),(2,7),(3,5),(4,1),(5,10)]
Note that while arrays are access, they are not update. They are in fact update, since in order to maintain purity, the array must be copied in order to make an update. Thus, functional arrays are pretty much only useful when you're filling them up once and then only reading. If you need fast access and update, you should probably use FiniteMap
s, which are discussed in the section on Finitemaps and have access and update.
Maps[edit]
This section is work in progress, please help improving it! 
The Map
datatype from the Data.Map
module is a purely functional implementation of balanced trees. Maps can be compared to lists and arrays in terms of the time it takes to perform various operations on those datatypes of a fixed size, . A brief comparison is:
List  Array  Map  

insert  
update  
delete  
find  
map 
As we can see, lists provide fast insertion (but slow everything else), arrays provide fast lookup (but slow everything else) and maps provide moderately fast everything.
The type of a map is for the form Map k a
where k
is the type of the keys and a
is the type of the elements. That is, maps are lookup tables from type k
to type a
.
The basic map functions are:
empty :: Map k a insert :: k > a > Map k a > Map k a delete :: k > Map k a > Map k a member :: k > Map k a > Bool lookup :: k > Map k a > a
In all these cases, the type k
must be an instance of Ord
(and hence also an instance of Eq
).
There are also function fromList
and toList
to convert lists to and from maps. Try the following:
Example:
Prelude> :m Data.Map Prelude Data.Map> let mymap = fromList [('a',5),('b',10),('c',1),('d',2)] Prelude Data.Map> let othermap = insert 'e' 6 mymap Prelude Data.Map> toList mymap [('a',5),('b',10),('c',1),('d',2)] Prelude Data.Map> toList othermap [('a',5),('b',10),('c',1),('d',2),('e',6)] Prelude Data.Map> Data.Map.lookup 'e' othermap 6 Prelude Data.Map> Data.Map.lookup 'e' mymap *** Exception: user error (Data.Map.lookup: Key not found)
Layout[edit]
The Final Word on Lists[edit]
You are likely tired of hearing about lists at this point, but they are so fundamental to Haskell (and really all of functional programming) that it would be terrible not to talk about them some more.
It turns out that foldr
is actually quite a powerful function: it can compute a primitive recursive function. A primitive recursive function is essentially one which can be calculated using only "for" loops, but not "while" loops.
In fact, we can fairly easily define map
in terms of foldr
:
map2 f = foldr (\a b > f a : b) []
Here, b
is the accumulator (i.e., the result list) and a
is the element being currently considered. In fact, we can simplify this definition through a sequence of steps:
foldr (\a b > f a : b) [] ==> foldr (\a b > (:) (f a) b) [] ==> foldr (\a > (:) (f a)) [] ==> foldr (\a > ((:) . f) a) [] ==> foldr ((:) . f) []
This is directly related to the fact that foldr (:) []
is the identity function on lists. This is because, as mentioned before, foldr f z
can be thought of as replacing the []
in lists by z
and the :
by f
. In this case, we're keeping both the same, so it is the identity function.
In fact, you can convert any function of the following style into a foldr
:
myfunc [] = z myfunc (x:xs) = f x (myfunc xs)
By writing the last line with f
in infix form, this should be obvious:
myfunc [] = z myfunc (x:xs) = x `f` (myfunc xs)
Clearly, we are just replacing []
with z
and :
with f
. Consider the filter
function:
filter p [] = [] filter p (x:xs) = if p x then x : filter p xs else filter p xs
This function also follows the form above. Based on the first line, we can figure out that z
is supposed to be []
, just like in the map
case. Now, suppose that we call the result of calling filter p xs
simply b
, then we can rewrite this as:
filter p [] = [] filter p (x:xs) = if p x then x : b else b
Given this, we can transform filter
into a fold:
filter p = foldr (\a b > if p a then a:b else b) []
Let's consider a slightly more complicated function: ++
. The definition for ++
is:
(++) [] ys = ys (++) (x:xs) ys = x : (xs ++ ys)
Now, the question is whether we can write this in fold notation. First, we can apply eta reduction to the first line to give:
(++) [] = id
Through a sequence of steps, we can also etareduce the second line:
(++) (x:xs) ys = x : ((++) xs ys) ==> (++) (x:xs) ys = (x:) ((++) xs ys) ==> (++) (x:xs) ys = ((x:) . (++) xs) ys ==> (++) (x:xs) = (x:) . (++) xs
Thus, we get that an etareduced definition of ++
is:
(++) [] = id (++) (x:xs) = (x:) . (++) xs
Now, we can try to put this into fold notation. First, we notice that the base case converts []
into id
. Now, if we assume (++) xs
is called b
and x
is called a
, we can get the following definition in terms of foldr
:
(++) = foldr (\a b > (a:) . b) id
This actually makes sense intuitively. If we only think about applying ++
to one argument, we can think of it as a function which takes a list and creates a function which, when applied, will prepend this list to another list. In the lambda function, we assume we have a function b
which will do this for the rest of the list and we need to create a function which will do this for b
as well as the single element a
. In order to do this, we first apply b
and then further add a
to the front.
We can further reduce this expression to a pointfree style through the following sequence:
==> (++) = foldr (\a b > (a:) . b) id ==> (++) = foldr (\a b > (.) (a:) b) id ==> (++) = foldr (\a > (.) (a:)) id ==> (++) = foldr (\a > (.) ((:) a)) id ==> (++) = foldr (\a > ((.) . (:)) a) id ==> (++) = foldr ((.) . (:)) id
This final version is point free, though not necessarily understandable. Presumbably the original version is clearer.
As a final example, consider concat
. We can write this as:
concat [] = [] concat (x:xs) = x ++ concat xs
It should be immediately clear that the z
element for the fold is []
and that the recursive function is ++
, yielding:
concat = foldr (++) []
Exercises 

