Haskell/More about lists
We have already met the basic tools for working with lists. We can build lists up from the cons operator (:)
and the empty list []
, and we can take them apart by using a combination of recursion and pattern matching. In this chapter and the next, we will consider more indepth techniques for list processing and discover a bit of new notation. Here, we will get our first taste of characteristic Haskell features like infinite lists, list comprehensions, and higherorder functions.
Note
Throughout this chapter, you will read and write functions which sum, subtract, and multiply elements of lists. For simplicity's sake, we will pretend the list elements have to be of type Integer
. However, as you will recall from the discussions on Type basics II, there are many different types with the Num typeclass. As an exercise of sorts, you could figure out what the type signatures of such functions would be if we made them polymorphic, allowing for the list elements to have any type in the class Num
. To check your signatures, just omit them temporarily, load the functions into GHCi, use :t and let type inference guide you.
Rebuilding lists[edit]
We'll start by writing and analysing a function that doubles every element from a list of integers:
doubleList :: [Integer] > [Integer] doubleList [] = [] doubleList (n:ns) = (2 * n) : doubleList ns
Here, the base case is the empty list; and it evaluates to an empty list. Otherwise, doubleList builds up a new list by using (:). The first element of this new list is twice the head of the argument, and the rest of the result is obtained by recursively calling doubleList on the tail of the argument. If the tail happens to be an empty list, the base case will be invoked and recursion stops.^{[1]}
By understanding the recursive definition, we can picture what actually happens when we evaluate an expression such as
doubleList [1,2,3,4]
We can work it out longhand by substituting the argument into the function definition, just like schoolbook algebra:
doubleList 1:[2,3,4] = (1*2) : doubleList (2 : [3,4]) = (1*2) : (2*2) : doubleList (3 : [4]) = (1*2) : (2*2) : (3*2) : doubleList (4 : []) = (1*2) : (2*2) : (3*2) : (4*2) : doubleList [] = (1*2) : (2*2) : (3*2) : (4*2) : [] = 2 : 4 : 6 : 8 : [] = [2, 4, 6, 8]
Thus, we rebuilt the original list replacing every element by its double.
In this longhand evaluation exercise, the moment at which we choose to evaluate the multiplications does not affect the result. We could just as well have done them immediately after each recursive call of doubleList.^{[2]}
This flexibility on evaluation order is reflected in some important properties of Haskell. As a pure functional programming language, it is mostly left to the compiler to decide when to actually evaluate things. As a lazy evaluation language, evaluation is usually deferred until the value is really needed, which may even be never in some cases.^{[3]} From the programmer's point of view, evaluation order rarely matters.^{[4]}
Generalizing[edit]
Suppose that we are solving a problem for which we need not only a function to double a list but also one that tripled it. In principle, we could follow the same strategy and define:
tripleList :: [Integer] > [Integer] tripleList [] = [] tripleList (n:ns) = (3 * n) : tripleList ns
Both doubleList
and tripleList
have very limited applicability. Every time we needed multiplying the elements of a list by 4, 8, 17 etc. we would need to write a new listmultiplying function, and all of them would do nearly the same thing. An obvious improvement would be generalizing our function to allow multiplication by any number. Doing so requires a function that takes an Integer
multiplicand as well as a list of Integer
s. Here is a way to define it:
multiplyList :: Integer > [Integer] > [Integer] multiplyList _ [] = [] multiplyList m (n:ns) = (m * n) : multiplyList m ns
This example deploys _
as a "don't care" pattern. The multiplicand is not used for the base case, so instead of giving it a name (like m
, n
or ns
) it is explicitly ignored.
multiplyList
solves our current problem for any integer number:
Prelude> multiplyList 17 [1,2,3,4] [17,34,51,68]
Exercises 

Write the following functions and test them out. Don't forget the type signatures.
take , drop , and sum . 
Generalizing even further[edit]
We just wrote a function which can multiply the elements of a list by any Integer
. But even though multiplyList
is much more flexible than doubleList
, it's still limited. Initially, we had a function constrained to multiplying the elements by 2
, and we had to hardcode a new function to use a different number. With multiplyList
, we could still complain about being constrained to apply multiplication to the list elements. What if we wanted to instead add to each element or to compute the square of each element? We would be back rewriting the recursive function for each case.
A key functionality of Haskell will save the day. Since the solution can be quite surprising, we will approach it in a somewhat roundabout way. Consider the type signature of multiplyList
:
multiplyList :: Integer > [Integer] > [Integer]
The first thing to know is that the >
arrow in type signatures is right associative. That means we can read this signature as:
multiplyList :: Integer > ([Integer] > [Integer])
How should we understand that? It tells us that multiplyList
is a function that takes one Integer
argument and evaluates to another function, which in turn takes a list of Integer
s and returns another list of Integer
s.
Consider our earlier doubleList function redefined in terms of multiplyList
:
doubleList :: [Integer] > [Integer] doubleList xs = multiplyList 2 xs
A practical consequence of what we have just said is that we can, metaphorically speaking, cancel out the `xs`:
doubleList = multiplyList 2
This definition style (with no argument variables) is called "pointfree" style. The expressions here are perfectly wellformed and stands on their own, so writing the second argument of multiplyList
(which is the same as the only argument of doubleList
) is strictly unnecessary. Applying only one argument to multiplyList
doesn't fail to evaluate, it just gives us just a more specific function of type [Integer] > [Integer]
instead of finishing with a final [Integer]
value.
The trickery above illustrates that functions in Haskell behave much like any other value. Functions that return other functions, and functions can stand alone as objects without mentioning their arguments. It's as though functions were just a normal constants. Could we use functions themselves as arguments even? Well, that's the key to our dilemma with multiplyList
. We need a function that takes not only multiplication but any other appropriate function and applies the given function to the elements of a list:
applyToIntegers :: (Integer > Integer) > [Integer] > [Integer] applyToIntegers _ [] = [] applyToIntegers f (n:ns) = (f n) : applyToIntegers f ns
With applyToIntegers
, we can take any Integer > Integer
function and apply it to the elements of a list of Integer
s. We can, of course, use this generalized function to redefine multiplyList
. Specifically, we just use the (*)
function as the first argument for applyToIntegers:
multiplyList :: Integer > [Integer] > [Integer] multiplyList x = applyToIntegers ((*) x)
If all this abstraction is confusing you, consider a concrete example: When we multiply 5 * 7 in Haskell, the (*)
function doesn't just take two arguments at once, it actually first takes the 5 and returns a new 5* function; and that new function then takes one argument and multiplies whatever that is by 5. So, we then give the 7 to the 5* function, and that returns us our final evaluated number (35 in this example).
So, all functions in Haskell really take only one argument. However, as we know how many intermediate functions will have to be generated before our final result is returned, we can treat our functions as if they took many arguments. The number of arguments we generally talk about functions taking is actually the number of functions of one argument between the first argument and the final, nonfunctional result value.
The process of creating intermediate functions when feeding arguments into a complex function is called currying (named after Haskell Curry, who is also the namesake of the Haskell programming language).
The map
function[edit]
While applyToIntegers
has type (Integer > Integer) > [Integer] > [Integer]
, there is nothing specific to integers in its algorithm. Therefore, we could define versions such as applyToChars
, applyToStrings
and applyToLists
just by changing the type signature. That would be horribly wasteful, though: we did not climb all the way up to this point just to need a different function for each type! Furthermore, nothing prevents us from changing the signature to, for instance, (Integer > String) > [Integer] > [String]
; thus giving a function that takes a function Integer > String
and returns a function [Integer] > [String]
which applies the function originally passed as argument to each element of an Integer
list.
The final step of generalization, then, is to make a fully polymorphic version of applyToIntegers
, with signature (a > b) > [a] > [b]
. Such a function already exists in Prelude: it is called map and defined as:
map :: (a > b) > [a] > [b] map _ [] = [] map f (x:xs) = (f x) : map f xs
Using it, we can effortlessly implement functions as different as...
multiplyList :: Integer > [Integer] > [Integer] multiplyList x = map ((*) x)
... and...
heads :: [[a]] > [a] heads = map head
Prelude> heads [[1,2,3,4],[4,3,2,1],[5,10,15]] [1,4,5]
map
is the ideal general solution for applying a function to each element of a list. Our original doubleList
problem was just a very specific version of map
. Functions like map
which take other functions as arguments are called higherorder functions, and they are extremely useful. In particular, we will meet some other important higherorder functions used for list processing in the next chapter.
Exercises 


Tips and Tricks[edit]
Before we carry on with more list processing, let us make a few miscellaneous useful observations about lists in Haskell.
Dot Dot Notation[edit]
Haskell has a convenient shorthand for writing ordered lists of regularlyspaced integers. Some examples to illustrate it:
Code Result   [1..10] [1,2,3,4,5,6,7,8,9,10] [2,4..10] [2,4,6,8,10] [5,4..1] [5,4,3,2,1] [1,3..10] [1,3,5,7,9]
The same notation can be used with characters as well, and even with floating point numbers  though the latter is not necessarily a good idea due to rounding errors. Try this:
[0,0.1 .. 1]
Note
The ..
notation only works with sequences with fixed differences between consecutive elements. For instance, you cannot write...
[0,1,1,2,3,5,8..100]
... and expect to magically get back the rest of the Fibonacci series.^{[5]}
Infinite Lists[edit]
One of the most mindbending things about Haskell lists is that they are allowed to be infinite. For example, the following generates the infinite list of integers starting with 1:
[1..]
(If you try this in GHCi, remember you can stop an evaluation with Ctrlc).
The same effect could be achieved with a recursive function:
intsFrom n = n : intsFrom (n + 1)  note there is no base case! positiveInts = intsFrom 1
This works because Haskell uses lazy evaluation: it never actually evaluates more than it needs at any given moment. In most cases an infinite list can be treated just like an ordinary one. The program will only go into an infinite loop when evaluation would actually require all the values in the list. Examples of this include sorting or printing the entire list. However:
evens = doubleList [1..]
will define "evens" to be the infinite list [2,4,6,8..]. You can pass "evens" into other functions, and this will work as long as you only need to evaluate part of the list for your final result.
Infinite lists are quite useful in Haskell. Often it's more convenient to define an infinite list and then take the first few items than to create a finite list. Functions that process two lists in parallel generally stop with the shortest, so making the second one infinite avoids having to find the length of the first. An infinite list is often a handy alternative to the traditional endless loop at the top level of an interactive program.
A note about head
and tail
[edit]
Given the choice of using either the ( : )
pattern or head
/tail
to split lists, pattern matching is almost always preferable. While it may be tempting to use head
and tail
due to simplicity and terseness, it is all too easy to forget that they fail on empty lists  and runtime crashes are never a good thing. While the Prelude function null :: [a] > Bool
provides a sane way of checking for empty lists without pattern matching (it returns True
for empty lists and False
otherwise), matching an empty list tends to be cleaner and clearer than the corresponding ifthenelse expression.
Exercises 


Notes
 ↑ Had we forgotten the base case, once the recursion got to an empty list the (x:xs) pattern match would fail, and we would get an error.
 ↑ As long as none of the calculations result in an error or nontermination, which are not problems in this case.
 ↑ The compiler is still free to evaluate things sooner if it will improve efficiency.
 ↑ One exception is the case of infinite lists (!) which we will consider in a short while.
 ↑ http://en.wikipedia.org/wiki/Fibonacci_number