Haskell/Lists and tuples
Lists and tuples are the two most fundamental ways of manipulating several values together, by grouping them into a single value.
Functions are one of the two major building blocks of any Haskell program. The other is the list. So let's switch over to the interpreter and build lists:
Prelude> let numbers = [1,2,3,4] Prelude> let truths = [True, False, False] Prelude> let strings = ["here", "are", "some", "strings"]
The square brackets delimit the list, and individual elements are separated by commas. The only important restriction is that all elements in a list must be of the same type. Trying to define a list with mixed-type elements results in a typical type error:
Prelude> let mixed = [True, "bonjour"] <interactive>:1:19: Couldn't match `Bool' against `[Char]' Expected type: Bool Inferred type: [Char] In the list element: "bonjour" In the definition of `mixed': mixed = [True, "bonjour"]
In addition to specifying the whole list at once using square brackets and commas, you can build them up piece by piece using the (:) operator. This process is often referred to as consing.
Example: Consing something on to a list
Prelude> let numbers = [1,2,3,4] Prelude> numbers [1,2,3,4] Prelude> 0:numbers [0,1,2,3,4]
When you cons something on to a list (something:someList), what you get back is another list. Therefore you can keep on consing for as long as you wish. It is important to note that the cons operator evaluates from right to left. Another (more general) way to think of it is that it takes the first value to its left and the whole expression to its right.
Example: Consing lots of things to a list
Prelude> 1:0:numbers [1,0,1,2,3,4] Prelude> 2:1:0:numbers [2,1,0,1,2,3,4] Prelude> 5:4:3:2:1:0:numbers [5,4,3,2,1,0,1,2,3,4]
In fact, this is how lists are actually built, by consing all elements to the empty list, . The commas-and-brackets notation is just syntactic sugar, a more pleasant way to write code. So [1,2,3,4,5] is exactly equivalent to 1:2:3:4:5:
You will, however, want to watch out for a potential pitfall in list construction. Whereas something like True:False: is perfectly good Haskell, True:False is not:
Prelude> True:False <interactive>:1:5: Couldn't match `[Bool]' against `Bool' Expected type: [Bool] Inferred type: Bool In the second argument of `(:)', namely `False' In the definition of `it': it = True : False
True:False produces a familiar-looking type error message, which tells us that the cons operator (:) (which is really just a function) expected a list as its second argument but we gave it another Bool instead. (:) only knows how to stick things onto lists.
So, when using cons, remember:
- The elements of the list must have the same type.
- You can only cons
(:)something onto a list, not the other way around (you cannot cons a list onto an element). So, the final item on the right must be a list, and the items on the left must be independent elements, not lists.
Strings are just lists
As we briefly mentioned in the Type Basics module, strings in Haskell are just lists of characters. That means values of type String can be manipulated just like any other list. For instance, instead of entering strings directly as a sequence of characters enclosed in double quotation marks, they may also be constructed through a sequence of Char values, either linked with (:) and terminated by an empty list or using the commas-and-brackets notation.
Prelude>"hey" == ['h','e','y'] True Prelude>"hey" == 'h':'e':'y': True
Using double-quoted strings is just more syntactic sugar.
Lists within lists
Lists can contain anything, just as long as they are all of the same type. Because lists are things too, lists can contain other lists! Try the following in the interpreter:
Example: Lists can contain lists
Prelude> let listOfLists = [[1,2],[3,4],[5,6]] Prelude> listOfLists [[1,2],[3,4],[5,6]]
Lists of lists can be pretty tricky sometimes, because a list of things does not have the same type as a thing all by itself; the type Int, for example, is different from [Int]. Let's sort through these implications with a few exercises:
Lists of lists can be useful because they allow one to express some kinds of complicated, structured data (two-dimensional matrices, for example). They are also one of the places where the Haskell type system truly shines. Human programmers, or at least this wikibook co-author, get confused all the time when working with lists of lists, and having restrictions of types often helps in wading through the potential mess.
A different notion of many
Tuples are another way of storing multiple values in a single value. There are two key differences between tuples and lists:
- They have a fixed number of elements (immutable), and so you can't cons to a tuple. Therefore, it makes sense to use tuples when you know in advance how many values are to be stored. For example, we might want a type for storing 2D coordinates of a point. We know exactly how many values we need for each point (two – the x and y coordinates), so tuples are applicable.
- The elements of a tuple do not need to be all of the same type. For instance, in a phonebook application we might want to handle the entries by crunching three values into one: the name, phone number, and the address of each person. In such a case, the three values won't have the same type, so lists wouldn't help, but tuples would.
Tuples are created within parentheses with elements delimited by commas. Let's look at some sample tuples:
Example: Some tuples
(True, 1) ("Hello world", False) (4, 5, "Six", True, 'b')
The first example is a tuple containing two elements: the first one is True, and the second is 1. The next example again has two elements: the first is "Hello world", and the second is False. The third example is a tuple consisting of five elements: the first is 4 (a number), the second is 5 (another number), the third is "Six" (a string), the fourth is True (a boolean value), and the fifth is 'b' (a character).
A quick note on nomenclature: In general you use n-tuple to denote a tuple of size n. 2-tuples (that is, tuples with 2 elements) are normally called pairs and 3-tuples triples. Tuples of greater sizes aren't actually all that common, but if you were to logically extend the naming system, you'd have quadruples, quintuples and so on, hence the general term tuple.
Tuples are also handy when you want to return more than one value from a function. In many languages, returning two or more things at once often requires wrapping them up in a single-purpose data structure, maybe one that only gets used in that function. In Haskell, you have the very convenient alternative of just returning them as a tuple.
Tuples within tuples (and other combinations)
We can apply the same reasoning to tuples about storing lists within lists. Tuples are things too, so you can store tuples with tuples (within tuples up to any arbitrary level of complexity). Likewise, you could also have lists of tuples, tuples of lists, and all sorts of other combinations along the same lines.
Example: Nesting tuples and lists
((2,3), True) ((2,3), [2,3]) [(1,2), (3,4), (5,6)]
There is one bit of trickiness to watch out for, however. The type of a tuple is defined not only by its size, but by the types of objects it contains. For example, the tuples
(47,"World") are fundamentally different. One is of type
(String,Int), whereas the other is
(Int,String). This has implications for building up lists of tuples. We could very well have lists like
[("a",1),("b",9),("c",9)], but having a list like
[("a",1),(2,"b"),(9,"c")] is right out. Can you spot the difference?
Up to now we have seen how to put values into lists and tuples. If they are to be of any use, though, there must be a way of getting back the stored values! In this section, we will explore the surface of this issue.
Let us begin with pairs (that is, 2-tuples). A very common use for them is to store the (x, y) coordinates of a point: imagine you have a chess board, and want to specify a specific square. You could do this by labelling all the rows from 1 to 8, and similarly with the columns. Then, a pair
(2, 5) could represent the square in row 2 and column 5. Say we want to define a function for finding all the pieces in a given row. One way of doing this would be to find the coordinates of all the pieces, then look at the row part and see whether it's equal to whatever row we're being asked to examine. This function would need, once it had the coordinate pair
(x, y) of a piece, to extract the
x (the row coordinate). To do that there are two functions,
snd, that retrieve the first and second elements out of a pair, respectively. Let's see some examples:
Prelude> fst (2, 5) 2 Prelude> fst (True, "boo") True Prelude> snd (5, "Hello") "Hello"
Note that these functions only work on pairs. Why? Yet again, it has to do with types. Pairs and triples (and quadruples, etc.) have necessarily different types, and
snd only accept pairs as arguments.
As for lists, the functions
tail are roughly analogous to
snd, in that they disassemble a list by taking apart what
head evaluates to the first element of the list, while
tail gives the rest of the list.
Prelude> 2:[7,5,0] [2,7,5,0] Prelude> head [2,7,5,0] 2 Prelude> tail [2,7,5,0] [7,5,0]
The four functions introduced here do not appear to be enough to fully solve the problem we started this section with. While
snd provide a satisfactory solution for pairs, what about tuples with three or more elements? And with lists, it would be natural to wonder if we can't do any better than just breaking them after the first element -
tail just don't seem to cut it. For the moment, we will have to leave these questions pending, but don't worry - the issues are perfectly solvable. Once we do some necessary groundwork we will return to this subject, considering it in detail and, in particular, dedicating a number of chapters to list manipulation. There, we will understand how separating head and tail allows us to do anything we want with lists.
As you may have found out already by playing with :t, the type of a list depends on the types of its elements, and is denoted by enclosing it in square brackets:
Prelude>:t [True, False] [True, False] :: [Bool] Prelude>:t ["hey", "my"] ["hey", "my"] :: [[Char]]
Therefore, lists of Bool have different types than lists of [Char] (that is, strings), of Int and so on. Since we have seen that functions only accept arguments of the types specified in the type of the function, that leads to some practical complication. For example, imagine a function that finds the length of a list. But since [Int], [Bool] and [String] are different types it seems we would need separate functions for each case –
lengthInts :: [Int] -> Int, as well as a
lengthBools :: [Bool] -> Int, as well as a
lengthStrings :: [String] -> Int, as well as a...
That would be horribly annoying, and frustrating too, because intuitively it would seem that counting how many things there are in a list should be independent of what the things actually are. Fortunately, it does not work like that: there is a single function length, which works on all lists. But how can that possibly work? As usual, checking the type of length provides a good hint that there is something different going on...
Example: Our first polymorphic type
Prelude>:t length length :: [a] -> Int
The a in the square brackets is not a type – remember that type names always start with uppercase letters. Instead, it is a type variable. When Haskell sees a type variable, it allows any type to take its place. This is exactly what we want. In type theory (a branch of mathematics), this is called polymorphism: functions or values with only a single type are called monomorphic, and things that use type variables to admit more than one type are polymorphic.
It is important to note that, in a single type signature, all cases of a certain type variable must be of the same type. For example,
f :: a -> a
means that f takes an argument of any type and gives something of the same type as the argument, as opposed to
f :: a -> b
which means that f takes an argument of any type and gives an argument of another type, which is not necessarily the same type, but can be.
As we saw, you can use the
snd functions to extract parts of pairs. By this time you should already be developing the habit of wondering "what type is that function?" about every function you come across. Let's consider the cases of
snd . These two functions take a pair as their argument and return one element of this pair. First of all, the type of a pair depends on the type of its elements, just as with lists, so the functions need to be polymorphic. Also it is important to keep in mind that pairs, and tuples in general, don't have to be homogeneous with respect to types; their different parts can be different types. So if we were to say:
fst :: (a, a) -> a
That would mean fst would only work if the first and second part of the pair given as input had the same type. So what is the correct type? Simply:
Example: The types of
fst :: (a, b) -> a snd :: (a, b) -> b
b is a second type variable, which stands for a type which may or may not be equal to the one which replaces a.
Note that if you knew nothing about fst and snd other than the type signatures you might guess that they return the first and second parts of a pair, respectively. Although that is correct, it is not necessarily true as all the signatures say is that they just have to return something with the same type of the first and second parts of the pair.
Give type signatures for the following functions:
We have introduced two new notions in this chapter, lists and tuples. Let us sum up the key similarities and differences between them:
- Lists are defined by square brackets and commas :
- They can contain anything as long as all the candidate elements of the list are of the same type
- They can also be built by the cons operator,
(:), but you can only cons things onto lists
- Tuples are defined by parentheses and commas :
- They can contain anything, even things of different types
- Their length is encoded in their type. That is, two tuples with different lengths will have different types.
- Lists and tuples can be combined in any number of ways: lists within lists, tuples with lists, etc, but their criteria must still be fulfilled for the combinations to be valid.
- You might object that "cons" isn't even a proper word. Well, it isn't. LISP programmers invented the verb "to cons" to refer to this specific task of appending an element to the front of a list. "cons" happens to be a mnemonic for "constructor". Later on we will see why that makes sense.
- At this point you might be justified in seeing types as an annoying thing. In a way they are, but more often than not they are actually a lifesaver. In any case, when you are programming in Haskell and something blows up, you'll probably want to think "type error".
- Or, more technically, which project. In math-speak, a function that gets some data out of a structure is called a projection.