Common Lisp/First steps/Experienced tutorial

Basic Operations

This chapter gives some theoretical basics about structure of Lisp programs.

Syntax

Lisp operates on forms. Each Lisp form is either an atom or a list of forms. Atoms are numbers, strings, symbols and some other structures. Lisp symbols are actually quite interesting - I'll talk about them in another section.

When Lisp is forced to evaluate the form it looks whether it's an atom or a list. If it's an atom then its value is returned (numbers, strings and other data return themselves, symbols return their value). If the form is a list, Lisp looks at the first element of the list, which is called its car (an archaic term which stands for Contents of the Address part of Register). The car of a list should be a symbol or a lambda-expression (lambda-expressions would be discussed later). If it's a symbol Lisp takes its function (the function associated to that symbol - NOT its value) and executes that function with the arguments taken from the rest of the list (if it contains forms they're evaluated as well).

Example: (+ 1 2 3) returns 6. Symbol "+" is associated with the function + that performs the addition of its arguments. (+ 1 (+ 2 3) 4) returns 10. The second argument contains a form and is evaluated before being passed to the outer +.

Some interesting functions

+, -, *, / are basic operations on numbers. They can accept multiple arguments. Note that (/ 1 2) is 1/2, not 0.5 - Lisp knows rational numbers (as well as complex ones...). <, <=, = and so on are used for number comparison. Note that =, <, <= and others are polyadic:

 (= 1 1 1) ⇒ t
 (= 1 1 2) ⇒ nil
 (< 1 2 3) ⇒ t
 (< 1 3 2) ⇒ nil

list as the name suggests creates a list.

 (list 1 2 3) ⇒ (1 2 3)

cons creates a pair (a pair has 2 elements and is not a list).

 (cons 1 2) ⇒ (1 . 2) ;;note the dot.

car or first returns the first element of the cons (pair). cdr or rest returns the second element of the cons.

 (car (cons 1 2)) ⇒ 1  (cdr (cons 1 2)) ⇒ 2

or

 (first '(1 2)) ⇒ 1

 (second '(1 2)) ⇒ 2

Lists and conses

Since lists are so prominent in Lisp it's good to know what they exactly are. The truth is that, with one exception, lists consist of conses. This exception is a special list called nil - also known as (). nil is a self-evaluating symbol that is used both as a falsehood constant and an empty list. nil is the only false value in Lisp - everything else is true for the purpose of if and similar constructs. The opposite of nil is t, which is also self-evaluating and represents the truth. t is, however, not a list. Let's return to the lists, then... A proper list (improper lists will not be explained here) is defined as any list that is either nil or a cons whose cdr is a proper list. (Note that since proper lists have to start from somewhere, (cdr (cdr (cdr... (cdr x)))...) is nil for some finite number of cdrs.)

Basically, a proper list is a sequence of conses, such that the next cons is a cdr of a previous cons. It's easy to understand how lists are constructed if you consider a graphical representation. A cons can be represented as a rectangle divided into two squares. Each of squares can hold a value. In a proper list left square holds the element of the list, and the right square holds the next cons (or nil if it is the end of the list). Note that each cons holds exactly one element of the list. That's how (1 2 3) would look in graphical representation:

.-------.
| * | * |
'-|---|-'
  V   V
  1 .-------.
    | * | * |
    '-|---|-'
      V   V
      2 .-------.
        | * | * |
        '-|---|-'
          V   V
          3  nil

So (1 2 3) is really (1 . (2 . (3 . nil))). What follows from that is that (car (list 1 2 3)) is 1 and (cdr (list 1 2 3)) is (2 3). What doesn't follow from all of the above is that (car nil) and (cdr nil) are nil. This is not very consistent, because (cons nil nil) is not the same thing as nil, but it happens to be convenient.

Symbols

Symbols play the same role as variable names in other programming languages. Basically a symbol is a string associated with some values. The string can consist of any characters, including spaces and control characters. However, most symbol names do not use characters other than letters, numbers, and hyphens, because they're awkward to type. In addition, the characters "(", ")", "#", "\", ".", "|", ";", whitespace, and the double and single quote marks are likely to be misunderstood by the lisp reader; and other characters such as "*" are conventionally used only for certain purposes. By default Lisp converts what you type to uppercase.

Symbols are created as you use them. For example, when you type (setf x 1) symbol called "X" is created (remember that Lisp uppercases your input) and its value is set to 1. However it is good style to define your symbols before you use them. defvar and defparameter are used for that purpose.

(defparameter x 1) ;;defines symbol "X" and sets its value to 1.

A symbol can also have other parameters associated with it besides its name and value - functions, classes and so on. To get a function associated with a symbol, a special operator (these are discussed in the next chapter) function is used.

Macros and special operators

There are operators in Lisp that look like functions, but behave slightly differently. These are macros and special operators. Functions always evaluate their arguments, but occasionally this is undesirable, so these forms must be implemented.

For example, consider the ubiquitous if construct. If takes the form (if condition then else); condition is first evaluated, followed by then if condition is not nil or else if it is nil. Thus, (if t 1 2) returns 1 and (if nil 1 2) returns 2. Obviously, if cannot be implemented as a function because only one of its two final arguments will be evaluated. Thus, it is created as one of about 25 special operators, which are all predefined in the Lisp implementation.

Another special operator is quote. It returns its only argument, unevaluated. Again, this is impossible with functions, as they always evaluate their arguments. Quote is used quite often, so it can be expressed by the single character ' . Thus, (quote x) is equivalent to 'x. quote may be used to quickly create lists: '(1 2 3) returns (1 2 3), '(x y z) returns (x y z) - compare that to (list x y z) which would create a list of values of x, y and z, or signal an error if no values were assigned. In fact, '(x y z) is the same value as (list 'x 'y 'z).

Macros are like special operators, but they're not hardcoded in a Lisp implementation. Instead, they can be defined in Lisp code. A lot of Lisp constructs you will be using are actually macros. Only very essential constructs are hardcoded. Of course, to the user there is no difference.

Simple programming

In this chapter I'll explain how to do simple things in Lisp. Many useful constructs will be introduced. After reading this chapter, you should be able to write simple programs.

Storing values

While storing values in variables is an important process in many programming languages, in Lisp it is used much less often. Although it is multi-paradigm, Lisp is often referred to and programmed in as a functional language. Functional languages do not allow (or at least discourage) the use of state, or stored information that implicitly changes the behavior of a function. In theory, assignments are never needed in a purely functional program. As you may have noticed, I never stored values anywhere in the previous chapter, except in the "Symbols" section. This shows that the storing of global values is rarely needed.

That being said, it is still useful to store values, and Lisp does provide for this. The macros setf and setq store values into symbols:

 (setq x 1) => 1
 x => 1
 (setq x 1 y 2 z 3) => 3
 (list x y z) => (1 2 3)

Setf is much more powerful than setq, as it allows the programmer to change a single part of a variable:

 (setq abc '(1 2 3)) => (1 2 3)
 (setq (car abc) 3) => error!
 (setf (car abc) 3) => 3
 abc => (3 2 3)

For this reason, setf is used more often than setq.

In other languages, assignments are usually denoted as something like x=1. The = here means something different than the mathematical = sign. Lisp reserves = for the mathematical definition, i.e. testing for numerical equality. It may appear that (setf place value) is more complicated than needed, but it is useful to remember that this facility is extensible and allows the user to remove the internal representation of data from the assignment. This is similar to reassigning the = operator in other languages (something not possible in C and some other languages).

Setf is a useful utility, despite the extra keystrokes it requires. In practice, though, you'll be using assignment less often than in other languages, because there is another way in Lisp to remember values: by binding them.

Note: setq stands for set quote. Originally, a set function existed that would evaluate its first argument. Programmers got tired of quoting the argument, and defined this special operator. Set is now deprecated.

Binding values

When values are bound to symbols, they are temporarily stored and then unbound, the values forgotten. With let and let* you can bind some values to some variables within some part of your program. The difference between let and let* is that let initialises its variables in parallel and let* does it sequentially.

 (let ((x 1) (y 2) (z 3)) (+ x y z)) => 6
 (let* ((x 1) (y (+ x 1)) (z (+ y 1))) (+ x y z)) => 6

Inside the body of let you can use variables you defined as though they're real symbols - outside of let these symbols can be unbound or have completely different values. If you call or define functions within the let body, the bindings stick around, making some interesting interactions possible (these are outside of the scope of this manual). You can even use setf on these variables and the new value would be stored temporarily. As soon as the last form in let body is executed its result is returned, and the variables are restored back to their original values.

 (setf x 3 y 4 z 5) => 5
 (let ((x 1) (y 2) (z 3)) (+ x y z)) => 6
 (+ x y z) => 12

Good programming practices generally recommend using local variables whenever possible, and limiting global variables only to when it is absolutely necessary. Thus, you should use let whenever possible, and treat setf like there is a tax on each use.

Control flow

The if operator was already explained before, but may seem hard to use at this point. This is because if allows only one form for each branch, which makes it hard to use in most situations. Fortunately, Lisp syntax gives you more freedom to define your blocks than C's curly braces or Pascal's begin/end. progn creates a very simple block of code, executing its arguments one by one and returning the result of the last one.

 (progn (setf x 1 y 2) (setf z (+ x y)) (* y z)) => 6

let and let* can also be used for this purpose, especially if you want some temporary variables inside the branches.

block creates a named block, from which you can return with return-from:

 (block aaa
   (return-from aaa 1) 
   (+ 1 2 3)) => 1 ;;The form (+ 1 2 3) is not evaluated.

...and other forms exist, such as the, locally, prog1, tagbody and so on. Fortunately, if you're not into writing Lisp macros, if is probably the only construct where you'll have to use blocks.

As if is quite ugly when used repeatedly, there are some convenient macros in place so you won't have to use it all that often. when evaluates its first argument and, if it's not nil, evaluates the rest of its arguments, returning the last result. unless does the same if its first argument is nil. Otherwise they both return nil.

cond is slightly more complicated, but also more useful. It tests its conditions until one of them is not nil and then evaluates the associated code. This is easier to parse and better-looking than nested if's. The syntax of cond is as follows:

 (cond (condition1 some-forms1)
       (condition2 some-forms2)
       ...and so on...)

case is similar to cond except it branches after examining the value of some expression:

 (case expression
   (values1 some-forms1) ;values is either one value 
   (values2 some-forms2) ;or a list of values.
   ...
   (t some-forms-t)) ;executed if no values match

or evaluates its arguments until one of them is not nil, returning its value, or the value of last argument.

and evaluates its arguments until one of them is nil, returning its value (nil, that is) - otherwise the value of the last argument is returned.

You may notice that or and and could be used as logical operations as well - remember that everything non-nil is true.

Loops

Iteration, or evaluating a form many times, is covered by many tools evaluating something many times. The most useful tool is loop - in its simplest form it simply executes its body until the return operator is called, in which case it returns the specified value:

 (setf x 0)
 (loop (setf x (+ x 1)) (when (> x 10) (return x))) => 11

More complicated forms of loop are better learned through examples. Depending on the type of iteration wanted, different operators, such as for and until, are used. While loop should be enough for all kind of loops, there are other constructs for those who wouldn't like to learn its full syntax.

dotimes executes some code a fixed number of times. Its syntax is:

 (dotimes (var number result) 
   forms)

dotimes increments var from 0 to number and executes forms each time with that value of var, returning result in the end.

dolist iterates through a list: its syntax is the same as of dotimes except that a list replaces number. var begins as the list's car and is moved until it reaches the list's last element.

mapcar applies a function to different sets of arguments and returns a list of results, for example:

 (mapcar #'+ '(1 3 6) '(2 4 7) '(3 5 8)) => (6 12 21)

#'+ is a shortcut for (function +). The function + is first applied to the list of arguments (1 2 3), then to (3 4 5) and then to (6 7 8).

Defining functions

Functions are defined using macro defun:

 (defun function-name (arguments)
   (body))

The created function is associated with the symbol function-name and can be called like any other function. It's worth mentioning that functions defined that way can be recursive or can call each other - this is an essential part of most Lisp programming. An example of a recursive function is:

 (defun factorial (x)
   (if (= x 0) 1
     (* x (factorial (- x 1)))))

As can be seen, this function simply describes an otherwise complicated operation by calling itself over and over. The process is stopped when x = 0, and x decreases with each call, eliminating (for positive integer x) the possibility of an infinite loop. Note that x, being an argument, takes on a different value for each call of the function. As seen in "Binding values" above, none of these values override any of the previous.

The special operator lambda creates an anonymous function, which you can use for a one-time purpose. Its syntax is the same except instead of "defun function-name" you write "lambda". These functions cannot be recursive. For the most part, lambda functions are used in forms such as mapcar (and funcall and apply below) to eliminate excessive repetition or memory use.

You can also bind functions temporarily with flet and labels. They are very similar to let and let*. The difference between them is that in labels a function can refer to itself, while in flet it refers to a former function with the same name instead.

Calling functions

To call a function f with arguments a1, a2, and a3, simply type

 (f a1 a2 a3)

Sometimes the function which should be called is stored in a variable and you don't know its name beforehand. Or maybe you don't know how many arguments are to be passed. In those cases the functions funcall and apply become handy. Like all functions they evaluate their arguments - the first one should produce a function to be called, the rest should produce arguments. Funcall just calls the function with whatever arguments supplied. Apply checks if its last argument is a list and if it is, it treats it as a list of arguments. Compare:

 (funcall #'list '(1 2) '(3 4))  => ((1 2) (3 4))
 (apply #'list '(1 2) '(3 4)) => ((1 2) 3 4)

User interaction

For this tutorial I've covered only the simple tasks of input and output. To read a value from the user use the read function. It will attempt to read an arbitrary Lisp expression from the input, and the value returned by read is that expression.

 >(read)  ;Run read function
 (+ 1 x)  ;That's what the user types
 (+ 1 X)  ;that's what is returned

In that example the returned value is a list of three elements: symbol +, number 1 and symbol X (note how it got uppercased). read is one of the three functions that comprise read-eval-print loop, the core element of Lisp. This is the same function that is used to read the expressions you type at the Lisp prompt.

While read is handy for receiving numbers and lists from the user, other data-types are expected by most users to be fed to computer in a different manner. Take strings for example. To make read recognise a string one must add double quotes around it "like this". But a normal user expects to just type like this without the quotes and press Enter. So, there is a different general-purpose function used for input: read-line. read-line returns a string that contains what user typed before pressing Enter. You can then process that string to extract the information you need.

For output there is also quite a number of functions available. One that is interesting to us is princ, which simply prints a supplied value, and also returns it. This may be confusing when using it in the Lisp console:

 >(princ "aaa")
 aaa
 "aaa"

The first aaa (without the quotes) is what is printed, while the second one is a returned value (which is printed by the print function and it is not as nice-looking). Another function that could be useful is terpri, which prints a new line to the output (the name "terpri" is historical and means "TERminate PRInt line").

For more information, return to Common Lisp.