Write Yourself a Scheme in 48 Hours/First Steps
First, you'll need to install GHC. On Linux, it's often pre-installed or available via the package manager (yum, for example, depending on your distribution). It's also downloadable from http://www.haskell.org/ghc/. A binary package is probably easiest, unless you really know what you're doing. It should download and install like any other software package. This tutorial was developed on Linux, but everything should also work on Windows, as long as you know how to use the command line, or on the Macintosh from within the Terminal.
For UNIX (or Windows Emacs) users, there is a pretty good Emacs mode, including syntax highlighting and automatic indentation. Windows users can use Notepad or any other text editor: Haskell syntax is fairly Notepad-friendly, though you have to be careful with the indentation. Eclipse users might want to try the eclipsefp plug-in. Finally, there's also a Haskell plugin for Visual Studio using the GHC compiler.
Now, it's time for your first Haskell program. This program will read a name off the command line and then print a greeting. Create a file ending in ".hs" and type the following text. Be sure to get the indentation right, or else it may not compile.
module Main where
import System.Environment

main :: IO ()
main = do
    args <- getArgs
    putStrLn ("Hello, " ++ args !! 0)
Let's go through this code. The first two lines specify that we'll be creating a module named Main that imports the System.Environment module. Every Haskell program begins with an action called main in a module named Main. That module may import others, but it must be present for the compiler to generate an executable file. Haskell is case-sensitive: module names are always capitalized, and the names of functions and values are always uncapitalized.
main :: IO () is a type declaration: it says that main is of type IO (), which is an IO action carrying along values of unit type (). A unit type allows only one value, also denoted (), thus holding no information. Type declarations in Haskell are optional: the compiler figures them out automatically, and only complains if they differ from what you've specified. In this tutorial, I specify the types of all declarations explicitly, for clarity. If you're following along at home, you may want to omit them, because it's less to change as we build our program.
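As a small illustration of explicit versus inferred types and of the unit type, here is a sketch that is not part of the tutorial's program. The names greeting, audience, and nothingUseful are made up for this example.

```haskell
module Main where

-- An explicit signature: the compiler checks it against the definition.
greeting :: String
greeting = "Hello"

-- No signature here (hypothetical example); GHC infers the type String.
audience = "world"

-- The unit type () has exactly one value, which is also written ().
nothingUseful :: ()
nothingUseful = ()

-- main is an IO action whose result is the unit value.
main :: IO ()
main = putStrLn (greeting ++ ", " ++ audience)
```

Compiled and run, this should print Hello, world. Removing the signature on greeting should leave the behavior unchanged, since the compiler infers the same type.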
The IO type is an instance of something called a monad, which is a scary name for a not-so-scary concept. Basically, a monad is a way of saying "we'll be carrying along and combining, in some specific manner, values with some extra information attached to them, which most functions don't need to worry about". How we carry along this extra information and combine our values is what makes a particular monad type what it is; underlying values might get changed and converted from one type into another by regular functions (invoked by actions), oblivious to the extra stuff that goes on around them, but the "pipe" (the value-propagation mechanism) stays the same.
In this example, the "extra information" is the IO actions to be performed using the carried-along values, and the final basic value is nothing, represented as (). IO [String] and IO () belong to the same IO monad type with different basic types, meaning they are IO actions acting upon, and passing along, values of different types: [String] and (). Monadic values thus combined with the basic values packaged inside them are often called "actions", because the easiest way to think about the IO monad is as a sequence of actions, each potentially acting on the passed-along basic values and affecting the outside world.
Haskell is a functional language: instead of giving the computer a sequence of instructions to carry out, you give it a collection of definitions that tell it how to perform every function it might need. These definitions use various compositions of actions and functions. The compiler figures out an execution path that puts everything together.
To write one of these definitions, you set it up as an equation. The left-hand side defines a name, and optionally one or more patterns (explained later) that will bind variables. The right-hand side defines some composition of other definitions that tells the computer what to do when it encounters the name. These equations behave just like ordinary equations in algebra: you can always substitute the right-hand side for the left within the text of the program, and it'll evaluate to the same value. Called "referential transparency", this property makes it significantly easier to reason about Haskell programs than about programs in other languages.
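The substitution property can be seen in a small sketch. The function square and the two values below are hypothetical names chosen for illustration; the second value is the first with each use of square replaced by the right-hand side of its equation.

```haskell
module Main where

-- An equation: the name on the left stands for the expression on the right.
square :: Int -> Int
square x = x * x

-- Written using the name square ...
viaName :: Int
viaName = square 3 + square 3

-- ... and with each use replaced by the right-hand side of its equation.
viaSubstitution :: Int
viaSubstitution = (3 * 3) + (3 * 3)

main :: IO ()
main = print (viaName == viaSubstitution)
```

Running this should print True, since both definitions evaluate to 18.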
How will we define our main action? We know that it must be an IO () action, which we want to read the command-line args and print some output, producing (), or nothing of value, eventually.
There are two ways to create an IO action (either directly or by calling a function that performs them):
- Lift an ordinary value into the IO monad, using the return function.
- Combine two existing IO actions.
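The two approaches might look like this in code. This is a minimal sketch with made-up names (defaultName, greetDefault); it uses a do-block for the combination, which the text that follows explains.

```haskell
module Main where

-- Way 1: lift an ordinary value into the IO monad with return.
-- Running this action has no side effects; it just yields the string.
defaultName :: IO String
defaultName = return "world"

-- Way 2: combine existing IO actions, here with a do-block.
greetDefault :: IO ()
greetDefault = do
    name <- defaultName
    putStrLn ("Hello, " ++ name)

main :: IO ()
main = greetDefault
```

Running this should print Hello, world.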
Since we want to do two things, we'll take the second approach. The built-in action getArgs reads the command-line arguments and passes them along as a list of strings. The built-in function putStrLn takes a string and creates an action that writes this string to the console.
To combine these actions, we use a do-block. A do-block consists of a series of lines, all lined up with the first non-whitespace character after the do. Each line can have one of two forms:
- name <- action1
- action2
The first form binds the result of action1 to name, making it available to the subsequent actions. For example, if the type of action1 is IO [String] (an IO action returning a list of strings, as with getArgs), then name will be bound in all the subsequent actions to the list of strings thus passed along, through the use of the "bind" operator >>=. The second form just executes action2, sequencing it with the next line (should there be one) through the >> operator. The bind operator has different semantics for each monad: in the case of the IO monad, it executes the actions sequentially, performing whatever external side effects result from them. Because the semantics of this composition depend upon the particular monad used, you cannot mix actions of different monad types in the same do-block: here, only actions in the IO monad can be used (it's all in the same "pipe").
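To make the role of >>= and >> concrete, here is a sketch that writes the greeting program both as a do-block and with the operators spelled out. The names mainDo, mainBind, and twoLines are hypothetical, and withArgs (from System.Environment) is used only to supply a sample argument so the example runs without command-line input.

```haskell
module Main where
import System.Environment (getArgs, withArgs)

-- The do-block version of the greeting program.
mainDo :: IO ()
mainDo = do
    args <- getArgs
    putStrLn ("Hello, " ++ args !! 0)

-- The same action with the bind operator written out:
-- "name <- action" becomes "action >>= \name -> ...".
mainBind :: IO ()
mainBind = getArgs >>= \args -> putStrLn ("Hello, " ++ args !! 0)

-- A line with no "<-" is sequenced with >>, which discards its result.
twoLines :: IO ()
twoLines = putStrLn "first" >> putStrLn "second"

-- withArgs supplies a sample argument list for the demonstration.
main :: IO ()
main = withArgs ["Jonathan"] (mainDo >> mainBind) >> twoLines
```

Both mainDo and mainBind should print the same greeting; the do-block is simply a more readable way of writing the chain of binds.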
Of course, these actions may themselves call functions or complicated expressions, passing along their results (either by calling the return function, or some other function that eventually does so). In this example, we first take the first element of the argument list (at index 0, args !! 0), concatenate it onto the end of the string "Hello, " ("Hello, " ++), and finally pass that to putStrLn, which creates a new IO action that participates in the do-block sequencing.
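One caveat: args !! 0 raises a runtime error when the program is started without arguments, because the list is then empty. A possible variant, not part of the tutorial's listing, moves the string building into a pure helper (the hypothetical greetingFor) that uses patterns to handle the empty case. The fallback text is an arbitrary choice for this sketch.

```haskell
module Main where
import System.Environment (getArgs)

-- Pure helper (hypothetical name): build the greeting from the argument list.
greetingFor :: [String] -> String
greetingFor (name : _) = "Hello, " ++ name            -- at least one argument
greetingFor []         = "Hello, whoever you are"     -- no arguments given

main :: IO ()
main = do
    args <- getArgs
    putStrLn (greetingFor args)
```

Keeping the helper pure means it can be tried out in GHCi without touching the command line, for example by evaluating greetingFor ["Jonathan"].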
The new action thus created, a combined sequence of actions as described above, is stored in the identifier main of type IO (). The Haskell system notices this definition and executes the action in it.
Strings are lists of characters in Haskell, so you can use any of the list functions and operators on them. A table of standard operators and their precedences follows:

| Operator(s) | Precedence | Associativity | Description |
|---|---|---|---|
| ^, ^^, ** | 8 | Right | Exponentiation (integer, fractional, and floating-point) |
| *, / | 7 | Left | Multiplication, Division |
| +, - | 6 | Left | Addition, Subtraction |
| : | 5 | Right | Cons (list construction) |
| `elem`, `notElem` | 4 | None | List Membership |
| ==, /=, <, <=, >=, > | 4 | None | Equals, Not-equals, and other relation operators |
| >>, >>= | 1 | Left | Monadic sequencing (discarding the first result), Monadic Bind (piping value to next function) |
| =<< | 1 | Right | Reverse Monadic Bind (same as above, but arguments reversed) |
| $ | 0 | Right | Infix Function Application (same as "f x", but right-associative instead of left) |
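The effect of these precedences can be checked with a short sketch. It relies on two Prelude fixities: !! binds at level 9 and ++ at level 5, which is why the greeting expression needs no inner parentheses. The value names below, including the sample args list, are made up for illustration.

```haskell
module Main where

-- A sample argument list standing in for the result of getArgs.
args :: [String]
args = ["Jonathan", "ignored"]

-- !! binds tighter than ++, so no inner parentheses are needed.
implicit :: String
implicit = "Hello, " ++ args !! 0

explicit :: String
explicit = "Hello, " ++ (args !! 0)

-- * (level 7) binds tighter than + (level 6).
arithmetic :: Int
arithmetic = 2 + 3 * 4

-- ^ is right-associative: this parses as 2 ^ (3 ^ 2), not (2 ^ 3) ^ 2.
power :: Int
power = 2 ^ 3 ^ 2

main :: IO ()
main = do
    putStrLn implicit
    -- $ has the lowest precedence, so it can replace the outer parentheses.
    putStrLn $ "Hello, " ++ args !! 0
    print (implicit == explicit, arithmetic, power)
```

Running this should print the greeting twice, followed by (True,14,512).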
To compile and run the program, try something like this:
debian:/home/jdtang/haskell_tutorial/code# ghc -o hello_you --make listing2.hs
debian:/home/jdtang/haskell_tutorial/code# ./hello_you Jonathan
Hello, Jonathan
The -o option specifies the name of the executable you want to create, and then you just specify the name of the Haskell source file.