Haskell/Foldable

From Wikibooks, open books for an open world
< Haskell
Jump to: navigation, search

The Foldable type class provides a generalisation of list folding (foldr and friends) and operations derived from it to arbitrary data structures. Besides being extremely useful, Foldable is a great example of how monoids can help formulating good abstractions.

Deconstructing foldr[edit]

foldr is quite a busy function − two binary functions on each side of the first function arrow, with types which use two variables each.

foldr :: (a -> b -> b) -> b -> [a] -> b

If we are going to generalise foldr, it would be convenient to have something simpler to work with, or at least to be able to break it down into simpler components. What could those components be?

A rough description of list folding would be that it consists of running through the list elements and combining them with a binary function. We happen to know one type class which is all about combining pairs of values: Monoid. If we take foldr f z ...

a `f` (b `f` (c `f` z)) -- foldr f z [a,b,c]

... and make f = (<>) and z = mempty ...

a <> (b <> (c <> mempty)) -- foldr (<>) mempty [a,b,c]

... we get mconcat = foldr mappend mempty, which is a simpler, specialised foldr in which we do not need to specify the combining function nor initial accumulator, as we simply use mappend (i.e. (<>)) and mempty:

mconcat :: Monoid m => [m] -> m

mconcat captures the combine-all-elements aspect of foldr well enough, and covers a few of its use cases:

GHCi> mconcat ["Tree", "fingers"] -- concat
"Treefingers"

Neat − but surely we don't want to be restricted to folding with Monoid instances only. One way to improve the situation a bit is by realising we can use mconcat to fold a list with elements of any type, as long as we have a function to convert them to some Monoid type:

foldMap :: Monoid m => (a -> m) -> [a] -> m
foldMap g = mconcat . fmap g

That makes things more interesting already:

GHCi> foldMap Sum [1..10]
Sum {getSum = 55}

So far so good, but it seems that we are still unable to fold with arbitrary combining functions. It turns out, however, that any binary function that fits the foldr signature can be used to convert values to a Monoid type! The trick is looking at the combining function passed to foldr as a function of one argument...

foldr :: (a -> (b -> b)) -> b -> [a] -> b

... and taking advantage of the fact that b -> b functions form a monoid under composition, with (.) as mappend and id as mempty [1]. The corresponding Monoid instance is available through the Endo wrapper from Data.Monoid [2]:

newtype Endo b = Endo { appEndo :: b -> b }

instance Monoid Endo where
    mempty                  = Endo id
    Endo g `mappend` Endo f = Endo (g . f)

We can now define...

foldComposing :: (a -> (b -> b)) -> [a] -> Endo b
foldComposing f = foldMap (Endo . f)

... which makes a b -> b function out of each element and composes them all:

Endo (f a) <> (Endo (f b) <> (Endo (f c) <> (Endo id))) -- foldComposing f [a,b,c]
Endo (f a . (f b . (f c . id))) 
-- (<>) and (.) are associative, so we don't actually need the parentheses.

-- As an example, here is a step-by-step evaluation:
foldComposing (+) [1, 2, 3]
foldMap (Endo . (+)) [1, 2, 3]
mconcat (fmap (Endo . (+)) [1, 2, 3]) 
mconcat (fmap Endo [(+1), (+2), (+3)])
mconcat [Endo (+1), Endo (+2), Endo (+3)]
Endo ((+1) . (+2) . (+3))
Endo (+6)

If we apply that function to some b value...

foldr :: (a -> (b -> b)) -> b -> [a] -> b
foldr f z xs = appEndo (foldComposing f xs) z

...we finally recover foldr. That means we can define foldr in terms of foldMap, a function which is much simpler and therefore easier to reason about. For that reason, foldMap is the conceptual heart of Foldable, the class which generalises foldr to arbitrary data structures.

Exercises
  1. Write two implementations of foldMap for lists: one in terms of foldr and the other using recursion explicitly.

The Foldable class[edit]

Implementing Foldable for a data structure requires writing just one function: either foldMap or foldr. Foldable, however, has a lot of other methods:

-- Abridged definition, with just the method signatures.
class Foldable t where
    foldMap :: Monoid m => (a -> m) -> t a -> m
    foldr :: (a -> b -> b) -> b -> t a -> b

    -- All of the following have default implementations:
    fold :: Monoid m => t m -> m -- generalised mconcat
    foldr' :: (a -> b -> b) -> b -> t a -> b
    foldl :: (b -> a -> b) -> b -> t a -> b
    foldl' :: (b -> a -> b) -> b -> t a -> b
    foldr1 :: (a -> a -> a) -> t a -> a
    foldl1 :: (a -> a -> a) -> t a -> a
    toList :: t a -> [a]
    null :: t a -> Bool
    length :: t a -> Int
    elem :: Eq a => a -> t a -> Bool
    maximum :: Ord a => t a -> a
    minimum :: Ord a => t a -> a
    sum :: Num a => t a -> a
    product :: Num a => t a -> a

The extra methods are there so that more efficient implementations can be written if necessary. In any case, writing just foldMap or foldr gives you all of the very useful functions listed above for free. And it gets even better: Data.Foldable provides still more functions generalised to any Foldable, including, remarkably, mapM_/traverse_.

Here is a quick demonstration of Foldable using Data.Map [3]:

GHCi> import qualified Data.Map as M
GHCi> let testMap = M.fromList $ zip [0..] ["Yesterday","I","woke","up","sucking","a","lemon"]
GHCi> length testMap 
7
GHCi> sum . fmap length $ testMap 
29
GHCi> elem "lemon" testMap 
True
GHCi> foldr1 (\x y -> x ++ (' ' : y)) testMap -- Be careful: foldr1 is partial!
"Yesterday I woke up sucking a lemon"
GHCi> import Data.Foldable
GHCi> traverse_ putStrLn testMap 
Yesterday
I
woke
up
sucking
a
lemon

Beyond providing useful generalisations, Foldable and foldMap suggest a more declarative way of thinking about folds. For instance, instead of describing sum as a function which runs across a list (or tree, or whatever the data structure is) accumulating its elements with (+), we might say that it queries each element for its value and summarises the results of the queries using the Sum monoid. Though the difference may seem small, the monoidal summary perspective can help clarifying problems involving folds by separating the core issue of what sort of result we wish to obtain from the details of the data structure being folded.

Exercises
  1. Let's play Spot The Monoid! Here are the rules:

    For each function, suggest a combination of mempty, mappend and, if necessary, a function to prepare the values that would allow it to be implemented with fold or foldMap. No need to bother with newtype instances (unless you want to test your solutions with foldMap, of course) − for example, "mempty is 0 and mappend is (+)" would be a perfectly acceptable answer for sum. If necessary, you can partially apply the functions and use the supplied arguments in the answers. Do not answer every question with id and (.) - that would be cheating!

    (Hint: if you need suggestions, have a look at the Monoid instances in Data.Monoid.)

    1. product :: (Foldable t, Num a) => t a -> a
    2. concat :: Foldable t => t [a] -> [a]
    3. concatMap :: Foldable t => (a -> [b]) -> t a -> [b]
    4. all :: Foldable t => (a -> Bool) -> t a -> Bool
    5. elem :: Eq a => a -> t a -> Bool
    6. length :: t a -> Int
    7. traverse_ :: (Foldable t, Applicative f) =>
      (a -> f b) -> t a -> f ()
    8. mapM_ :: (Foldable t, Monad m) =>
      (a -> m b) -> t a -> m ()
    9. safeMaximum :: Ord a => t a -> Maybe a
      (like maximum, but handling emptiness.)
    10. find :: Foldable t => (a -> Bool) -> t a -> Maybe a
    11. composeL :: Foldable t =>
      (b -> a -> b) -> t a -> b -> b

      (equivalent to foldl.)

List-like folding[edit]

Foldable includes the toList :: Foldable t => t a -> [a] method. That means any Foldable data structure can be turned into a list; moreover, folding the resulting list will produce the same results than folding the original structure directly. A possible toList implementation in terms of foldMap would be [4]:

toList = foldMap (\x -> [x])

toList reflects the fact that lists are the free monoid for Haskell types. "Free" here means any value can be promoted to the monoid in a way which neither adds nor erases any information (we can convert values of type a to [a] lists with a single element and back through (\x->[x]) and head in a lossless way) [5].

A related key trait of Foldable is made obvious by toList. Since toList = id for lists, if you are given a function defined as...

-- Given a list xs :: [a] 
xsAsFoldMap :: Monoid m => (a -> m) -> m
xsAsFoldMap = \f -> foldMap f xs

... it is always possible to recover the original list xs by supplying (\x->[x]) to xsAsFoldMap. In this sense, lists are equivalent to their right folds. That implies folding with the Foldable operations will unavoidably be a lossy operation if the data structure is more complex than a list. Putting it in another way, we can say that the list-like folds offered by Foldable are less general than folds of the sort we have seen back in Other data structures (formally known as catamorphisms), which do make it possible to reconstruct the original structure.

Exercises
  1. This exercise concerns the tree type we used in Other data structures:

    data Tree a = Leaf a | Branch (Tree a) (Tree a)

    1. Write a Foldable instance for Tree.
    2. Implement treeDepth :: Tree a -> Int, which gives the number of branches from the root of the tree to the furthest leaf. Use either the Foldable or the treeFold catamorphism defined in Other data structures. Are both suggestions actually possible?

More facts about Foldable[edit]

Foldable is slightly unusual among Haskell classes which are both principled and general-purpose in that it has no laws of its own. The closest thing is the following property, which strictly speaking is not a law (as it is guaranteed to hold whatever the instance is): given a monoid homomorphism g,

foldMap (g . f) = g . foldMap f

Switching from foldMap (g . f) to g . foldMap f can be advantageous, as it means applying g only to the result of the fold, rather than to the potentially many elements in the structure being folded.

If the Foldable structure is a Functor as well, it also automatically holds that...

foldMap f = fold . fmap f

... and thus we get, after applying the second functor law and the property just above:

foldMap g . fmap f = foldMap (g . f) = g . foldMap f

Though the presence of a method such as foldMap might suggest that any Foldable types should have Functor instances as well, Functor is not actually a superclass of Foldable. That makes it possible to give Foldable instances to structures that, for whatever reason, cannot be Functors. The most common example are the sets from Data.Set. Element types for those sets must be instances of Ord, and therefore their map function cannot be used as fmap, which has no additional class constraints. That, however, does not deny Data.Set.Set an useful Foldable instance.

GHCi> import qualified Data.Set as S
GHCi> let testSet = S.fromList [1,3,2,5,5,0]
GHCi> testSet 
fromList [0,1,2,3,5]
GHCi> import Data.Foldable
GHCi> toList testSet
[0,1,2,3,5]
GHCi> foldMap show testSet 
"01235"
Exercises
    1. Write the monoid instance for pairs,

      (Monoid a, Monoid b) => Monoid (a,b)

    2. Prove that fst and snd are monoid homomorphisms.
    3. Use the monoid homomorphism property of foldMap presented above to prove that
      foldMap f &&& foldMap g = foldMap (f &&& g)
      where
      f &&& g = \x -> (f x, g x)

    This exercise is based on a message by Edward Kmett [6].

Notes

  1. This trick will probably ring familiar if you did the exercise about foldl at the end of Higher order functions.
  2. "Endo" is shorthand for "endomorphism", a jargony word for functions from one type to the same type.
  3. For more information on Data.Map and other useful data structure implementations with, see the data structures primer.
  4. Data.Foldable uses a different default implementation for performance reasons.
  5. There is one caveat relating to non-termination with regards to saying lists form a free monoid. For details, see the Free Monoids in Haskell post by Dan Doel. (Note that the discussion there is quite advanced. You might not enjoy it much right now if you have just been introduced to Foldable.)
  6. Source (Haskell Café): [1]