Haskell/Control structures

From Wikibooks, open books for an open world
Jump to navigation Jump to search

Haskell offers several ways of expressing a choice between different values. We explored some of them in the Haskell Basics chapters. This section will bring together what we have seen thus far, discuss some finer points, and introduce a new control structure.

if and guards revisited

[edit | edit source]

We have already met these constructs. The syntax for if expressions is:

if <condition> then <true-value> else <false-value>

<condition> is an expression which evaluates to a boolean. If the <condition> is True then the <true-value> is returned, otherwise the <false-value> is returned. Note that in Haskell if is an expression (which is converted to a value) and not a statement (which is executed) as in many imperative languages.[1] As a consequence, the else is mandatory in Haskell. Since if is an expression, it must evaluate to a result whether the condition is true or false, and the else ensures this. Furthermore, <true-value> and <false-value> must evaluate to the same type, which will be the type of the whole if expression.

When if expressions are split across multiple lines, they are usually indented by aligning elses with thens, rather than with ifs. A common style looks like this:

describeLetter :: Char -> String
describeLetter c =
    if c >= 'a' && c <= 'z'
        then "Lower case"
        else if c >= 'A' && c <= 'Z'
            then "Upper case"
            else "Not an ASCII letter"

Guards and top-level if expressions are mostly interchangeable. With guards, the example above is a little neater:

describeLetter :: Char -> String
describeLetter c
   | c >= 'a' && c <= 'z' = "Lower case"
   | c >= 'A' && c <= 'Z' = "Upper case"
   | otherwise            = "Not an ASCII letter"

Remember that otherwise is just an alias to True, and thus the last guard is a catch-all, playing the role of the final else of the if expression.

Guards are evaluated in the order they appear. Consider a set up like the following:

f (pattern1) | predicate1 = w
             | predicate2 = x

f (pattern2) | predicate3 = y
             | predicate4 = z

Here, the argument of f will be pattern-matched against pattern1. If it succeeds, then we proceed to the first set of guards: if predicate1 evaluates to True, then w is returned. If not, then predicate2 is evaluated; and if it is true x is returned. Again, if not, then we proceed to the next case and try to match the argument against pattern2, repeating the guards procedure with predicate3 and predicate4. (Of course, if neither pattern matches or neither predicate is true for the matching pattern there will be a runtime error. Regardless of the chosen control structure, it is important to ensure all cases are covered.)

Embedding if expressions

[edit | edit source]

A handy consequence of if constructs being expressions is that they can be placed anywhere a Haskell expression could be, allowing us to write code like this:

g x y = (if x == 0 then 1 else sin x / x) * y

Note that we wrote the if expression without line breaks for maximum terseness. Unlike if expressions, guard blocks are not expressions; and so a let or a where definition is the closest we can get to this style when using them. Needless to say, more complicated one-line if expressions would be hard to read, making let and where attractive options in such cases.

Short-circuit operators

[edit | edit source]

The || and && operators mentioned before are in fact control structures: they evaluate the first argument and then the second argument only if needed.

Avoiding excessive effort

[edit | edit source]

For instance, suppose a large number n is to be checked to determine if it is a prime number and a function isPrime is available, but alas, it requires a lot of computation to evaluate. Using the function \n -> n == 2 || (n `mod` 2 /= 0 && isPrime n) will help if there are to be many evaluations with even values of n.

Avoidance of error conditions

[edit | edit source]

&& can be used to avoid signalling a run-time error, such as divide-by-zero or index-out-of-bounds, etc. For instance, the following locates the last non-zero element of a list:

lastNonZero a = go a (length a-1)
  where
    go a l | l >= 0 && a !! l == 0 = go a (l-1)
           | l < 0  = Nothing
           | otherwise = Just (a !! l)

Should all elements of the list be zero, the loop will work down to l = -1, and in this case the condition in the first guard will be evaluated without attempting to dereference element -1, which does not exist.

case expressions

[edit | edit source]

One control structure we haven't talked about yet is case expressions. They are to piece-wise function definitions what if expressions are to guards. Take this simple piece-wise definition:

f 0 = 18
f 1 = 15
f 2 = 12
f x = 12 - x

It is equivalent to - and, indeed, syntactic sugar for:

f x =
    case x of
        0 -> 18
        1 -> 15
        2 -> 12
        _ -> 12 - x

Whatever definition we pick, the same happens when f is called: The argument x is matched against all of the patterns in order, and on the first match the expression on the right-hand side of the corresponding equal sign (in the piece-wise version) or arrow (in the case version) is evaluated. Note that in this case expression there is no need to write x in the pattern; the wildcard pattern _ gives the same effect.[2]

Indentation is important when using case. The cases must be indented further to the right than the beginning of the line containing the of keyword, and all cases must have the same indentation. For the sake of illustration, here are two other valid layouts for a case expression:

f x = case x of
    0 -> 18
    1 -> 15
    2 -> 12
    _ -> 12 - x


f x = case x of 0 -> 18
                1 -> 15
                2 -> 12
                _ -> 12 - x

Since the left hand side of any case branch is just a pattern, it can also be used for binding, exactly like in piece-wise function definitions:[3]

describeString :: String -> String
describeString str =
  case str of
    (x:xs) -> "The first character of the string is: " ++ [x] ++ "; and " ++
              "there are " ++ show (length xs) ++ " more characters in it."
    []     -> "This is an empty string."

This function describes some properties of str using a human-readable string. Using case syntax to bind variables to the head and tail of our list is convenient here, but you could also do this with an if-expression (with a condition of null str to pick the empty string case).

Finally, just like if expressions (and unlike piece-wise definitions), case expressions can be embedded anywhere another expression would fit:

data Colour = Black | White | RGB Int Int Int

describeBlackOrWhite :: Colour -> String
describeBlackOrWhite c =
  "This colour is"
  ++ case c of
       Black           -> " black"
       White           -> " white"
       RGB 0 0 0       -> " black"
       RGB 255 255 255 -> " white"
       _               -> "... uh... something else"
  ++ ", yeah?"

The case block above fits in as any string would. Writing describeBlackOrWhite this way makes let/where unnecessary (although the resulting definition is not as readable).

Exercises
Use a case expression to implement a fakeIf function which could be used as a replacement to the familiar if expressions.

Controlling actions, revisited

[edit | edit source]

In the final part of this chapter, we will introduce a few extra points about control structures while revisiting the discussions in the "Simple input and output" chapter. There, in the Controlling actions section, we used the following function to show how to execute actions conditionally within a do block using if expressions:

doGuessing num = do
   putStrLn "Enter your guess:"
   guess <- getLine
   if (read guess) < num
     then do putStrLn "Too low!"
             doGuessing num
     else if (read guess) > num
            then do putStrLn "Too high!"
                    doGuessing num
            else putStrLn "You Win!"

We can write the same doGuessing function using a case expression. To do this, we first introduce the Prelude function compare which takes two values of the same type (in the Ord class) and returns a value of type Ordering — namely one of GT, LT, EQ, depending on whether the first is greater than, less than, or equal to the second.

doGuessing num = do
  putStrLn "Enter your guess:"
  guess <- getLine
  case compare (read guess) num of
    LT -> do putStrLn "Too low!"
             doGuessing num
    GT -> do putStrLn "Too high!"
             doGuessing num
    EQ -> putStrLn "You Win!"

The dos after the ->s are necessary on the first two options, because we are sequencing actions within each case.

A note about return

[edit | edit source]

Now, we are going to dispel a possible source of confusion. In a typical imperative language (C, for example) an implementation of doGuessing might look like the following (if you don't know C, don't worry with the details, just follow the if-else chain):

void doGuessing(int num) {
  printf("Enter your guess:");
  int guess = atoi(readLine());
  if (guess == num) {
    printf("You win!\n");
    return;
  }

  // we won't get here if guess == num
  if (guess < num) {
    printf("Too low!\n");
    doGuessing(num);
  } else {
    printf("Too high!\n");
    doGuessing(num);
  }
}

This doGuessing first tests the equality case, which does not lead to a new call of doGuessing, and the if has no accompanying else. If the guess was right, a return statement is used to exit the function at once, skipping the other cases. Now, going back to Haskell, action sequencing in do blocks looks a lot like imperative code, and furthermore there actually is a return in Prelude. Then, knowing that case expressions (unlike if expressions) do not force us to cover all cases, one might be tempted to write a literal translation of the C code above (try running it if you are curious)...

doGuessing num = do
  putStrLn "Enter your guess:"
  guess <- getLine
  case compare (read guess) num of
    EQ -> do putStrLn "You win!"
             return ()

  -- we don't expect to get here if guess == num
  if (read guess < num)
    then do putStrLn "Too low!"
            doGuessing num
    else do putStrLn "Too high!"
            doGuessing num

... but it won't work! If you guess correctly, the function will first print "You win!," but it will not exit at the return (). Instead, the program will continue to the if expression and check whether guess is less than num. Of course it is not, so the else branch is taken, and it will print "Too high!" and then ask you to guess again. Things aren't any better with an incorrect guess: it will try to evaluate the case expression and get either LT or GT as the result of the compare. In either case, it won't have a pattern that matches, and the program will fail immediately with an exception (as usual, the incomplete case alone should be enough to raise suspicion).

The problem here is that return is not at all equivalent to the C (or Java etc.) statement with the same name. For our immediate purposes, we can say that return is a function.[4] The return () in particular evaluates to an action which does nothing. return does not affect the control flow at all. In the correct guess case, the case expression evaluates to return (), an action of type IO (), and execution just follows along normally.

The bottom line is that while actions and do blocks resemble imperative code, they must be dealt with on their own terms - Haskell terms.

Exercises
  1. Redo the "Haskell greeting" exercise in Simple input and output/Controlling actions, this time using a case expression.
  2. What does the following program print out? And why?
main =
 do x <- getX
    putStrLn x

getX =
 do return "My Shangri-La"
    return "beneath"
    return "the summer moon"
    return "I will"
    return "return"
    return "again"


Notes

  1. If you have programmed in C or Java, you will recognize Haskell's if/then/else as an equivalent to the ternary conditional operator ?: .
  2. To see why this is so, consider our discussion of matching and binding in the Pattern matching section
  3. Thus, case expressions are a lot more versatile than most of the superficially similar switch/case statements in imperative languages which are typically restricted to equality tests on integral primitive types.
  4. Superfluous note: somewhat closer to a proper explanation, we might say return is a function which takes a value and makes it into an action which, when evaluated, gives the original value. A return "strawberry" within one of the do blocks we are dealing with would have type IO String - the same type as getLine. Do not worry if that doesn't make sense for now; you will understand what return really does when we actually start discussing monads further ahead on the book.