Haskell/Indentation
Haskell relies on indentation to reduce the verbosity of your code, but working with the indentation rules can be a bit confusing. The rules may seem many and arbitrary, but the reality of things is that there are only one or two layout rules, and all the seeming complexity and arbitrariness comes from how these rules interact with your code. So to take the frustration out of indentation and layout, the simplest solution is to get a grip on these rules.[1]
The golden rule of indentation
While the rest of this chapter will discuss in detail Haskell's indentation system, you will do fairly well if you just remember a single rule: Code which is part of some expression should be indented further in than the beginning of that expression (even if the expression is not the leftmost element of the line).
What does that mean? The easiest example is a let binding group. The equations binding the variables are part of the let expression, and so should be indented further in than the beginning of the binding group: the let keyword. So,
    let
     x = a
     y = b
When you start the expression on a separate line, you only need to indent by one space. However, it's more normal to place the first line alongside the 'let' and indent the rest to line up:
Wrong:

    let x = a
     y = b

Wrong:

    let x = a
         y = b

Right:

    let x = a
        y = b
This tends to trip up a lot of beginners: When starting a group inline, all expressions in the group have to be exactly aligned with the position of the first expression (Haskell counts everything to the left of the first expression as indent, even though it is not whitespace).
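As a minimal sketch of the two accepted styles, the definitions below should compile as written. The names `sumSeparate` and `sumInline` are hypothetical and exist only for this illustration.

```haskell
-- Bindings start on the line after 'let': they only need to be
-- indented further than the 'let' and lined up with each other.
sumSeparate :: Int
sumSeparate =
  let
   x = 1
   y = 2
  in x + y

-- First binding on the same line as 'let': 'y' must start in exactly
-- the same column as 'x'.
sumInline :: Int
sumInline =
  let x = 1
      y = 2
  in x + y
```

Moving `y` one column to the left or right in `sumInline` reproduces the two "Wrong" layouts shown earlier.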
Here are some more examples:
    do foo
       bar
       baz

    where x = a
          y = b

    case x of
      p  -> foo
      p' -> baz
Note that with 'case' it's less common to place the first subsidiary expression on the same line as the 'case' keyword, unlike common practice with 'do' and 'where' expressions. Hence the subsidiary expressions in a case expression tend to be indented only one step further than the 'case' line. Also note that we lined up the arrows here: this is purely aesthetic and does not count as different layout; only indentation (whitespace starting at the far-left edge of a line) makes a difference to layout.
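The following sketch combines a `where` group and a `case` expression in one compilable definition. The function `describe` is a hypothetical example, not something from the text above.

```haskell
describe :: Int -> String
describe n = prefix ++ body
  where prefix = "n is "
        -- 'body' starts in the same column as 'prefix'
        body   = case compare n 0 of
          -- the alternatives are indented further than 'body'
          LT -> "negative"
          EQ -> "zero"
          GT -> "positive"
```

The extra spaces before the `=` in `body` are purely cosmetic, just like the aligned arrows in the `case` example.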
Things get more complicated when the beginning of the expression doesn't start at the left-hand edge. In this case, it's safe to just indent further than the line containing the expression's beginning. So,
    myFunction firstArgument secondArgument = do -- the 'do' doesn't start at the left-hand edge
      foo                                        -- so indent these commands more than the beginning of the line containing the 'do'.
      bar
      baz
Here are some alternative layouts which work:
    myFunction firstArgument secondArgument =
      do foo
         bar
         baz

    myFunction firstArgument secondArgument = do foo
                                                 bar
                                                 baz
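To see the three layouts side by side in code that compiles, here is a sketch using the `Maybe` monad. The helper `safeDiv` and the names `layoutA`, `layoutB` and `layoutC` are assumptions made up for this example.

```haskell
-- Hypothetical helper: division that fails on a zero divisor.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv a b = Just (a `div` b)

-- 'do' at the end of the line; the statements are indented further
-- than the start of the line containing the 'do'.
layoutA :: Int -> Int -> Maybe Int
layoutA a b = do
  q <- safeDiv a b
  r <- safeDiv q b
  return (q + r)

-- 'do' on its own line with the first statement beside it.
layoutB :: Int -> Int -> Maybe Int
layoutB a b =
  do q <- safeDiv a b
     r <- safeDiv q b
     return (q + r)

-- First statement on the same line as the '='; the others line up with it.
layoutC :: Int -> Int -> Maybe Int
layoutC a b = do q <- safeDiv a b
                 r <- safeDiv q b
                 return (q + r)
```

All three definitions should behave identically; the choice between them is a matter of taste.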
A mechanical translation
Did you know that indentation layout is optional? It is entirely possible to write Haskell as you would a "one-dimensional" language like C, using semicolons to separate things and curly braces to group them. Not only can it occasionally be useful to write code in this style, but understanding how to convert from one style to the other also helps in understanding the indentation rules. To do so, you need to understand two things: where we need semicolons/braces, and how to get there from layout. The entire layout process can be summed up in three translation rules (plus a fourth one that doesn't come up very often):
- If you see one of the layout keywords (let, where, of, do), insert an open curly brace (right before the stuff that follows it)
- If you see something indented to the SAME level, insert a semicolon
- If you see something indented LESS, insert a closing curly brace
- If you see something unexpected in a list, like where, insert a closing brace before it instead of a semicolon.
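The rules above can be tried out on a small definition. The sketch below gives a layout version and a hand-translated version with explicit braces and semicolons; `withLayout` and `withBraces` are hypothetical names, and the translation is done by hand following the listed rules rather than taken from any compiler output.

```haskell
-- Layout version.
withLayout :: Int -> Int
withLayout n =
  let a = n + 1
      b = a * 2
  in case b of
       0 -> a
       _ -> a + b

-- The same definition with the braces and semicolons written out.
-- Inside explicit braces the indentation no longer matters.
withBraces :: Int -> Int
withBraces n =
  let { a = n + 1
      ; b = a * 2
      } in case b of
       { 0 -> a
       ; _ -> a + b
       }
```

Note how `in` and the end of the definition each trigger a closing brace: `in` is indented less than the bindings, and nothing more follows the last alternative.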
| Exercises |
|---|
| Answer in one word: what happens if you see something indented MORE? |
| Exercises |
|---|
Translate the following layout into curly braces and semicolons. Note: to underscore the mechanical nature of this process, we deliberately chose something which is probably not valid Haskell:
      of a
         b
          c
         d
      where
      a
      b
      c
      do
     you
      like
     the
    way
     i let myself
            abuse
           these
     layout rules
|
Layout in action
Wrong:

    do first thing
    second thing
    third thing

Right:

    do first thing
       second thing
       third thing
do within if
What happens if we put a do expression inside an if? The keywords if, then and else (and indeed anything other than the four layout keywords) do not affect layout. So things remain exactly the same:
Wrong:

    if foo
       then do first thing
            second thing
            third thing
       else do something else

Right:

    if foo
       then do first thing
               second thing
               third thing
       else do something else
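Here is a compilable sketch of a do block inside each branch of an if, using the `Maybe` monad. The table `prices` and the function `total` are hypothetical and serve only as an illustration.

```haskell
-- Hypothetical price table used only for this illustration.
prices :: [(String, Int)]
prices = [("apple", 3), ("pear", 5)]

total :: Bool -> Maybe Int
total both =
  if both
     -- every statement lines up with the first one after 'do'
     then do a <- lookup "apple" prices
             p <- lookup "pear" prices
             return (a + p)
     else do lookup "apple" prices
```

Shifting the second and third statements left so that they line up with the `do` keyword itself, rather than with `a`, gives the "Wrong" layout.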
Indent to the first
Remember that, due to the "golden rule of indentation" described above, although the keyword do tells Haskell to insert a curly brace, where the curly brace goes depends not on the do but on the thing that immediately follows it. For example, this weird-looking block of code is totally acceptable:
              do
    first thing
    second thing
    third thing
As a result, you could also write a combined if/do like this:
Wrong:

    if foo
       then do first thing
            second thing
            third thing
       else do something else

Right:

    if foo
       then do
        first thing
        second thing
        third thing
       else do something else
This is also the reason why you can write things like this:

    main = do
      first thing
      second thing

instead of

    main =
      do first thing
         second thing

Both are acceptable.
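A small sketch of the same idea in code that compiles: the block's column is set by the first statement after `do`, wherever the `do` itself sits. The helper `half` and the names `quarterA` and `quarterB` are hypothetical.

```haskell
-- Hypothetical helper: halve an even number, fail on an odd one.
half :: Int -> Maybe Int
half n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

-- The statements start on the line after 'do' and sit just one column
-- deeper than 'then'; the less-indented 'else' ends the block.
quarterA :: Int -> Maybe Int
quarterA n =
  if n > 0
     then do
      h <- half n
      half h
     else do Nothing

-- 'do' at the end of the '=' line, statements on the following lines.
quarterB :: Int -> Maybe Int
quarterB n = do
  h <- half n
  half h
```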
if within do
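This is a combination which trips up many Haskell programmers. Why does the following block of code not work? (Strictly speaking, it is rejected under the Haskell 98 rules. Haskell 2010 allows an optional semicolon before then and else, so recent versions of GHC accept it by default. The example is still worth understanding, because it shows exactly how layout reads a do block.)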
    -- why is this bad?
    do first thing
       if condition
       then foo
       else bar
       third thing
Just to reiterate, the if then else block is not at fault for this problem. Instead, the issue is that the do block notices that the then part is indented to the same column as the if part, so it is not very happy, because from its point of view, it just found a new statement of the block. It is as if you had written the unsugared ("unsweet") version shown second here:

Sweet (layout):

    -- why is this bad?
    do first thing
       if condition
       then foo
       else bar
       third thing

Unsweet:

    -- still bad, just explicitly so
    do { first thing
       ; if condition
       ; then foo
       ; else bar
       ; third thing }
Naturally, the Haskell compiler is confused because it thinks that you never finished writing your if expression before starting a new statement. The compiler sees that you have written something like if condition;, which is clearly bad, because it is unfinished. So, in order to fix this, we need to indent the bottom parts of this if block a little bit inwards:
Sweet (layout):

    -- whew, fixed it!
    do first thing
       if condition
         then foo
         else bar
       third thing

Unsweet:

    -- the fixed version without sugar
    do { first thing
       ; if condition
          then foo
          else bar
       ; third thing }
This little bit of indentation prevents the do block from misinterpreting your then as a brand new expression. Of course, you might prefer to always add indentation before then and else, even when it is not strictly necessary. That doesn't hurt legibility, and it avoids bad surprises like this one.
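As a compilable sketch of the fixed layout, the function below uses an if statement in the middle of a do block in the `Maybe` monad. The name `positiveRoot` is hypothetical and chosen only for this example.

```haskell
positiveRoot :: Double -> Maybe Double
positiveRoot x = do
  y <- Just (x - 1)   -- first thing
  if y < 0            -- the 'if' starts in the block's column
    then Nothing      -- 'then' and 'else' are indented further, so layout
    else Just ()      -- reads them as part of the same statement
  return (sqrt y)     -- third thing
```

Because `then` and `else` sit to the right of the `if`, no semicolon is inserted before them, and the whole conditional is read as a single statement of the block.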
| Exercises |
|---|
The if-within-do issue has tripped up so many Haskellers that one programmer has posted a proposal to the Haskell prime initiative to add optional semicolons between if then else. How would that help? |
Notes
- [1] See section 2.7 of The Haskell Report (lexemes) on layout.