Haskell/Type declarations

From Wikibooks, open books for an open world
Jump to navigation Jump to search

You're not restricted to working with just the types provided by default with the language. There are many benefits to defining your own types:

  • Code can be written in terms of the problem being solved, making programs easier to design, write and understand.
  • Related pieces of data can be brought together in ways more convenient and meaningful than simply putting and getting values from lists or tuples.
  • Pattern matching and the type system can be used to their fullest extent by making them work with your custom types.

Haskell has three basic ways to declare a new type:

  • The data declaration, which defines new data types.
  • The type declaration for type synonyms, that is, alternative names for existing types.
  • The newtype declaration, which defines new data types equivalent to existing ones.

In this chapter, we will study data and type. In a later chapter, we will discuss newtype and see where it can be useful.

data and constructor functions[edit | edit source]

data is used to define new data types mostly using existing ones as building blocks. Here's a data structure for elements in a simple list of anniversaries:

data Anniversary = Birthday String Int Int Int       -- name, year, month, day
                 | Wedding String String Int Int Int -- spouse name 1, spouse name 2, year, month, day

This declares a new data type Anniversary, which can be either a Birthday or a Wedding. A Birthday contains one string and three integers and a Wedding contains two strings and three integers. The definitions of the two possibilities are separated by the vertical bar. The comments explain to readers of the code about the intended use of these new types. Moreover, with the declaration we also get two constructor functions for Anniversary; appropriately enough, they are called Birthday and Wedding. These functions provide a way to build a new Anniversary.

Types defined by data declarations are often referred to as algebraic data types, which is something we will address further in later chapters.

As usual with Haskell, the case of the first letter is important: type names and constructor functions must start with capital letters. Other than this syntactic detail, constructor functions work pretty much like the "conventional" functions we have met so far. In fact, if you use :t in GHCi to query the type of, say, Birthday, you'll get:

*Main> :t Birthday
Birthday :: String -> Int -> Int -> Int -> Anniversary

Meaning it's just a function which takes one String and three Int as arguments and evaluates to an Anniversary. This anniversary will contain the four arguments we passed as specified by the Birthday constructor.

Calling constructors is no different from calling other functions. For example, suppose we have John Smith born on 3rd July 1968:

johnSmith :: Anniversary
johnSmith = Birthday "John Smith" 1968 7 3

He married Jane Smith on 4th March 1987:

smithWedding :: Anniversary
smithWedding = Wedding "John Smith" "Jane Smith" 1987 3 4

These two anniversaries can, for instance, be put in a list:

anniversariesOfJohnSmith :: [Anniversary]
anniversariesOfJohnSmith = [johnSmith, smithWedding]

Or you could just as easily have called the constructors straight away when building the list (although the resulting code looks a bit cluttered).

anniversariesOfJohnSmith = [Birthday "John Smith" 1968 7 3, Wedding "John Smith" "Jane Smith" 1987 3 4]

Deconstructing types[edit | edit source]

To use our new data types, we must have a way to access their contents. For instance, one very basic operation with the anniversaries defined above would be extracting the names and dates they contain as a String. So we need a showAnniversary function (for the sake of code clarity, we used an auxiliary showDate function but let's ignore it for a moment):

showDate :: Int -> Int -> Int -> String
showDate y m d = show y ++ "-" ++ show m ++ "-" ++ show d

showAnniversary :: Anniversary -> String

showAnniversary (Birthday name year month day) =
   name ++ " born " ++ showDate year month day

showAnniversary (Wedding name1 name2 year month day) =
   name1 ++ " married " ++ name2 ++ " on " ++ showDate year month day

This example shows how we can deconstruct the values built in our data types. showAnniversary takes a single argument of type Anniversary. Instead of just providing a name for the argument on the left side of the definition, however, we specify one of the constructor functions and give names to each argument of the constructor (which correspond to the contents of the Anniversary). A more formal way of describing this "giving names" process is to say we are binding variables. "Binding" is being used in the sense of assigning a variable to each of the values so that we can refer to them on the right side of the function definition.

To handle both "Birthday" and "Wedding" Anniversaries, we needed to provide two function definitions, one for each constructor. When showAnniversary is called, if the argument is a Birthday Anniversary, the first definition is used and the variables name, year, month and day are bound to its contents. If the argument is a Wedding Anniversary, then the second definition is used and the variables are bound in the same way. This process of using a different version of the function depending on the type of constructor is pretty much like what happens when we use a case statement or define a function piece-wise.

Note that the parentheses around the constructor name and the bound variables are mandatory; otherwise the compiler or interpreter would not take them as a single argument. Also, it is important to have it absolutely clear that the expression inside the parentheses is not a call to the constructor function, even though it may look just like one.

Exercises

Note: The solution of this exercise is given near the end of the chapter, so we recommend that you attempt it before getting there.
Reread the function definitions above. Then look closer at the showDate helper function. We said it was provided "for the sake of code clarity", but there is a certain clumsiness in the way it is used. You have to pass three separate Int arguments to it, but these arguments are always linked to each other as part of a single date. It would make no sense to do things like passing the year, month and day values of the Anniversary in a different order, or to pass the month value twice and omit the day.

  • Could we use what we've seen in this chapter so far to reduce this clumsiness?
  • Declare a Date type which is composed of three Int, corresponding to year, month and day. Then, rewrite showDate so that it uses the new Date data type. What changes will then be needed in showAnniversary and the Anniversary for them to make use of Date?.

type for making type synonyms[edit | edit source]

As mentioned in the introduction of this module, code clarity is one of the motivations for using custom types. In that spirit, it could be nice to make it clear that the Strings in the Anniversary type are being used as names while still being able to manipulate them like ordinary Strings. This calls for a type declaration:

type Name = String

The code above says that a Name is now a synonym for a String. Any function that takes a String will now take a Name as well (and vice-versa: functions that take Name will accept any String). The right hand side of a type declaration can be a more complex type as well. For example, String itself is defined in the standard libraries as

type String = [Char]

We can do something similar for the list of anniversaries we made use of:

type AnniversaryBook = [Anniversary]

Type synonyms are mostly just a convenience. They help make the roles of types clearer or provide an alias to such things as complicated list or tuple types. It is largely a matter of personal discretion to decide how type synonyms should be deployed. Abuse of synonyms could make code confusing (for instance, picture a long program using multiple names for common types like Int or String simultaneously).

Incorporating the suggested type synonyms and the Date type we proposed in the exercise(*) of the previous section the code we've written so far looks like this:


((*) last chance to try that exercise without looking at the spoilers.)


type Name = String

data Anniversary = 
   Birthday Name Date
   | Wedding Name Name Date

data Date = Date Int Int Int   -- Year, Month, Day

johnSmith :: Anniversary
johnSmith = Birthday "John Smith" (Date 1968 7 3)

smithWedding :: Anniversary
smithWedding = Wedding "John Smith" "Jane Smith" (Date 1987 3 4)

type AnniversaryBook = [Anniversary]

anniversariesOfJohnSmith :: AnniversaryBook
anniversariesOfJohnSmith = [johnSmith, smithWedding]

showDate :: Date -> String
showDate (Date y m d) = show y ++ "-" ++ show m ++ "-" ++ show d 

showAnniversary :: Anniversary -> String
showAnniversary (Birthday name date) =
   name ++ " born " ++ showDate date
showAnniversary (Wedding name1 name2 date) =
   name1 ++ " married " ++ name2 ++ " on " ++ showDate date

Even in a simple example like this one, there is a noticeable gain in simplicity and clarity compared to the same task using only Ints, Strings, and corresponding lists.

Note that the Date type has a constructor function which is called Date as well. That is perfectly valid and indeed giving the constructor the same name of the type when there is just one constructor is good practice, as a simple way of making the role of the function obvious.

Note

After these initial examples, the mechanics of using constructor functions may look a bit unwieldy, particularly if you're familiar with analogous features in other languages. There are syntactical constructs that make dealing with constructors more convenient. These will be dealt with later on, when we return to the topic of constructors and data types to explore them in detail.