Haskell/Indentation

Haskell relies on indentation to reduce the verbosity of your code, but working with the indentation rules can be a bit confusing. The rules may seem numerous and arbitrary, but in reality there are only one or two layout rules, and all the apparent complexity comes from how these rules interact with your code. So the simplest way to take the frustration out of indentation and layout is to get a grip on these rules.[1]

The golden rule of indentation

While the rest of this chapter discusses Haskell's indentation system in detail, you will do fairly well if you just remember a single rule: code which is part of some expression should be indented further in than the beginning of that expression (even if the expression is not the leftmost element of the line).

What does that mean? The easiest example is a 'let' binding group. The equations binding the variables are part of the 'let' expression, and so should be indented further in than the beginning of the binding group: the 'let' keyword. When you start the expression on a separate line, you only need to indent by one space (although more than one space is also acceptable).

let
 x = a
 y = b

Although the separation above makes things very clear, it is common to place the first line alongside the 'let' and indent the rest to line up:

wrong:

let x = a
 y = b

wrong:

let x = a
     y = b

right:

let x = a
    y = b

This tends to trip up a lot of beginners: all grouped expressions must be exactly aligned. (On the first line, Haskell counts everything to the left of the first expression as indentation, even though it is not whitespace.)
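
The alignment rule can be tried out in a small, self-contained sketch. The function names hypotenuseSquared and hypotenuseSquared' are made up for illustration; the only point is the column in which x and y start.

```haskell
-- First binding on the same line as 'let': 'y' must start in exactly
-- the same column as 'x'.
hypotenuseSquared :: Int -> Int -> Int
hypotenuseSquared a b =
  let x = a * a
      y = b * b
  in  x + y

-- 'let' on its own line: indenting the bindings by one extra space
-- past the 'let' keyword is enough.
hypotenuseSquared' :: Int -> Int -> Int
hypotenuseSquared' a b =
  let
   x = a * a
   y = b * b
  in x + y
```

Both definitions describe the same binding group; only the placement of the first binding differs.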


Here are some more examples:

do
  foo
  bar
  baz

do foo
   bar
   baz

where x = a
      y = b

case x of
  p  -> foo
  p' -> baz

Note that with 'case' it's less common to place the first subsidiary expression on the same line as the 'case' keyword, unlike common practice with 'do' and 'where' expressions. Hence the subsidiary expressions in a case expression tend to be indented only one step further than the 'case' line. Also note that we lined up the arrows here: this is purely aesthetic and does not count as different layout; only indentation (whitespace beginning at the far-left edge) makes a difference to layout.
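
As a rough illustration, the three shapes above can be combined in compilable code. All names here (classify, safeDiv, safeDivSum) are hypothetical and chosen only for this sketch; the 'do' block runs in the Maybe monad so that no input or output is needed.

```haskell
-- 'case' alternatives sit one step in from the 'case' line; the 'where'
-- bindings all start in the same column as the first one, 'small'.
classify :: Int -> String
classify n =
  case compare n limit of
    LT -> small
    EQ -> "exactly at the limit"
    GT -> big
  where small = "below the limit"
        big   = "above the limit"
        limit = 10

-- Division that fails on a zero divisor instead of crashing.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- A 'do' block with its first statement on the same line as 'do' and
-- the remaining statements aligned beneath it.
safeDivSum :: Int -> Int -> Int -> Maybe Int
safeDivSum a b c =
  do q1 <- safeDiv a c
     q2 <- safeDiv b c
     return (q1 + q2)
```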

Things get more complicated when the expression doesn't begin at the left-hand edge. In this case, it's safe to just indent further than the line containing the expression's beginning. So,

myFunction firstArgument secondArgument = do -- the 'do' doesn't start at the left-hand edge
  foo                                        -- so indent these commands more than the beginning of the line containing the 'do'.
  bar
  baz

Here are some alternative layouts which work:

myFunction firstArgument secondArgument = 
  do foo
     bar
     baz

myFunction firstArgument secondArgument = do foo
                                             bar
                                             baz

myFunction firstArgument secondArgument =
  do
     foo
     bar
     baz
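
To check that these layouts really are interchangeable, here is a minimal sketch that writes one definition three ways, using the list monad so the results can be compared directly. The names pairsA, pairsB and pairsC are invented for this example.

```haskell
-- Layout 1: 'do' ends the first line; the statements are indented
-- further than the start of that line.
pairsA :: [Int] -> [Int] -> [(Int, Int)]
pairsA xs ys = do
  x <- xs
  y <- ys
  return (x, y)

-- Layout 2: 'do' starts the next line; the statements align with the
-- first statement.
pairsB :: [Int] -> [Int] -> [(Int, Int)]
pairsB xs ys =
  do x <- xs
     y <- ys
     return (x, y)

-- Layout 3: 'do' on a line of its own; the statements are indented
-- further than the 'do'.
pairsC :: [Int] -> [Int] -> [(Int, Int)]
pairsC xs ys =
  do
     x <- xs
     y <- ys
     return (x, y)
```

All three functions should return the same list for the same arguments, since layout only affects where the implicit braces and semicolons go.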

A mechanical translation

Indentation is actually optional if you instead separate things using semicolons and curly braces, as in "one-dimensional" languages like C. It is occasionally useful to write code in this style, and understanding how to convert from one style to the other can help you understand the indentation rules. To do so, you need to understand two things: where semicolons and braces are needed, and how to get there from layout. The entire layout process can be summed up in three translation rules (plus a fourth one that doesn't come up very often):

  1. If you see one of the layout keywords (let, where, of, do), insert an opening curly brace right before the stuff that follows it.
  2. If you see something indented to the SAME level, insert a semicolon.
  3. If you see something indented LESS, insert a closing curly brace.
  4. If you see something unexpected in a list, like where, insert a closing brace before it instead of a semicolon.
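
As a sketch of what the translation produces, the definitions below come in pairs: one written with layout and one with the braces and semicolons spelled out by hand. The names (totalLayout, totalExplicit, lookupBothLayout, lookupBothExplicit, table) are hypothetical and exist only for this illustration.

```haskell
-- Layout version: the compiler inserts the braces and semicolons.
totalLayout :: Int
totalLayout =
  let a = 1
      b = 2
  in a + b

-- Explicit version: braces and semicolons written by hand.
totalExplicit :: Int
totalExplicit =
  let { a = 1; b = 2 } in a + b

-- A small association list used by the 'do' examples.
table :: [(String, Int)]
table = [("a", 1), ("b", 2)]

-- The same pairing for a 'do' block in the Maybe monad.
lookupBothLayout :: Maybe Int
lookupBothLayout = do
  x <- lookup "a" table
  y <- lookup "b" table
  return (x + y)

lookupBothExplicit :: Maybe Int
lookupBothExplicit =
  do { x <- lookup "a" table; y <- lookup "b" table; return (x + y) }
```

Each explicit definition should evaluate to the same value as its layout counterpart.
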
Exercises
Answer in one word: what happens if you see something indented MORE?


Exercises
Translate the following layout into curly braces and semicolons. Note: to underscore the mechanical nature of this process, we deliberately chose something which is probably not valid Haskell:
  of a
     b
      c
     d
  where
  a
  b
  c
  do
 you
  like
 the
way
 i let myself
        abuse
       these
 layout rules

Layout in action

wrong:

 do first thing
 second thing
 third thing

wrong:

 do first thing
  second thing
  third thing

right:

 do first thing
    second thing
    third thing

right:

 do
   first thing
   second thing
   third thing

Indent to the first

Remember that, according to the "golden rule of indentation" described above, although the keyword do tells Haskell to insert a curly brace, where the curly brace goes depends not on the do but on the thing that immediately follows it. For example, this weird-looking block of code is totally acceptable:

         do
first thing
second thing
third thing
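
A compilable variant of the same idea, offered as a sketch: the definition sums (a made-up name) puts do far to the right, yet the block is positioned by its first statement. Because this is a top-level definition, the statements still have to be indented past the start of the definition itself.

```haskell
-- The 'do' keyword sits in column 11, but the block's indentation is
-- set by 'x', the first token after it, which sits in column 3.
sums :: [Int]
sums =
          do
  x <- [1, 2]
  y <- [10, 20]
  return (x + y)
```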

As a result, you can also combine if and do like this:

wrong:

 if foo
    then do first thing
         second thing
         third thing
    else do something else

right:

 if foo
    then do
     first thing
     second thing
     third thing
    else do something else
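
Here is a minimal, compilable sketch of an if/do combination. The functions positive and checkedRatio are hypothetical, and the Maybe monad stands in for whatever monad the real code would use.

```haskell
-- Accept only strictly positive numbers.
positive :: Int -> Maybe Int
positive v = if v > 0 then Just v else Nothing

checkedRatio :: Bool -> Int -> Int -> Maybe Int
checkedRatio strict n d =
  if strict
     -- 'do' ends the line, so the block is positioned by 'n'' on the
     -- next line; it only has to be indented further than 'then'.
     then do
      n' <- positive n
      d' <- positive d
      return (n' `div` d')
     -- First statement on the same line as 'do': the next statement
     -- must line up with 'let', not with 'else' or 'do'.
     else do let d' = max 1 d
             return (n `div` d')
```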

This is also the reason why you can write things like this:

main = do
 first thing
 second thing

or

main = 
 do
   first thing
   second thing

instead of

main = 
 do first thing
    second thing

Either way is acceptable.

if within do

This is a combination which trips up many Haskell programmers. Why does the following block of code not work?

-- why is this bad?
do first thing
   if condition
   then foo
   else bar
   third thing

Just to reiterate, the if then else block itself is not at fault. The problem is that the do block sees then indented to the same column as if, and from its point of view that means it has just found a new statement of the block. It is as if you had written the unsugared version, shown second here:

sweet (layout):

-- why is this bad?
do first thing
   if condition
   then foo
   else bar
   third thing

unsweet:

-- still bad, just explicitly so
do { first thing
   ; if condition
   ; then foo
   ; else bar
   ; third thing }

Naturally, the Haskell compiler is confused because it thinks that you never finished writing your if expression before starting a new statement. The compiler sees something like if condition;, which is clearly bad because it is unfinished. To fix this, we need to indent the then and else parts of the if block a little further:

sweet (layout):

-- whew, fixed it!
do first thing
   if condition
     then foo
     else bar
   third thing

unsweet:

-- the fixed version without sugar
do { first thing
   ; if condition
      then foo
      else bar
   ; third thing }

This little bit of indentation prevents the do block from misinterpreting your then as a brand new statement. Of course, you may well prefer to always indent then and else, even when it is not strictly necessary. That doesn't hurt legibility, and it avoids bad surprises like this one.
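
The fixed shape can be tried in a small, compilable sketch. The functions nonNegative and halveEven are invented for this example, and the Either String monad is used so that each statement can fail with a message.

```haskell
-- Reject negative inputs with an error message.
nonNegative :: Int -> Either String ()
nonNegative v
  | v >= 0    = Right ()
  | otherwise = Left "negative input"

halveEven :: Int -> Either String Int
halveEven n = do
  nonNegative n
  -- 'then' and 'else' are indented further than 'if', so the 'do'
  -- block treats all three lines as one statement.
  if even n
    then Right ()
    else Left "odd input"
  return (n `div` 2)
```

With this layout, the if statement sits between two other statements of the block in the same way as in the fixed example.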

Exercises
The if-within-do issue has tripped up so many Haskellers that one programmer posted a proposal to the Haskell' (Haskell Prime) initiative to allow optional semicolons before then and else. How would that help? (A change of this kind, known as DoAndIfThenElse, was later adopted in Haskell 2010.)


Notes

  1. See section 2.7 of The Haskell Report (lexemes) on layout.