The Sway Reference Manual/More about Functions

We have already seen some examples of functions, some user-defined and some built-in. For example, we have used built-in functions such as *. Strictly speaking, * is not a function but a variable bound to the function that multiplies two numbers together. Because it is tedious to say 'the function bound to *', we use the more concise (but technically incorrect) phrase 'the * function'.

Predefined Functions

Sway has many predefined functions. You can see the list of built-in functions by executing the command

sway -p

at the system prompt (don't confuse the system prompt with the Sway interpreter prompt). More detail on the built-in functions is given in the last chapter. No one, however, can anticipate all possible tasks that someone might want to perform, so most programming languages allow the user to define new functions. Sway is no exception and provides for the creation of new functions. Of course, to be useful, these functions should be able to call built-in functions as well as other programmer-created functions.

For example, a function that determines whether a given number is odd or even is not built into Sway but can be quite useful in certain situations. Here is a definition of a function named even? which returns true if the given number is even, false otherwise:

sway> function even?(x) { x % 2 == 0; }
FUNCTION: <function even?(x)>

sway> even?;
FUNCTION: <function even?(x)>

sway> even?(4);
SYMBOL: :true

sway> even?(5);
SYMBOL: :false

sway> even?(4 + 5);
SYMBOL: :false

We could talk for days about what's going on in these interactions with the interpreter. First, let's talk about the syntax of a function definition. Later, we'll talk about the purpose of a function definition. Finally, we'll talk about the mechanics of a function definition and a function call.

Function syntax

Recall that the words of a programming language include its primitives, keywords and variables. A function definition corresponds to a sentence in the language in that it is built up from the words of the language. And like human languages, the sentences must follow a certain form. This specification of the form of a sentence is known as its syntax. Computer scientists often describe the syntax of a programming language using a special notation called Backus-Naur form (or BNF). Here is a high-level description of the syntax of a Sway function definition using BNF:

functionDefinition :
'function' variable '(' optionalParameterList ')' block

optionalParameterList : $\varepsilon$ | parameterList

parameterList : variable
| variable ',' parameterList

block: '{' definitionSequence statementSequence '}'

The first BNF rule says that a function definition begins with the keyword function (parts of the rule that appear verbatim appear within single quotes), followed by a variable, followed by an open parenthesis, followed by something called an optionalParameterList, followed by a close parenthesis, followed by something called a block. By reading the remaining rules, we see that one defines a function by entering the keyword function followed by the name of the function, followed by a parenthesized list of formal parameters (possibly empty, as indicated by $\varepsilon$, which stands for nothing at all), followed by the body of the function. The body is a brace-enclosed list of definitions, then statements (the block that follows the parameter list is often called the function body).
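To make the grammar concrete, here is a minimal sketch of a recognizer for the header part of the first rule (everything up to, but not including, the block). It is written in Python rather than Sway, and it is not how the Sway interpreter parses. The names tokenize, is_variable, and parse_header are hypothetical, as is the assumption that a variable is a run of letters, digits, and underscores with an optional trailing question mark.

```python
import re

# Assumed token shapes: an identifier (optionally ending in '?', as in even?)
# or a single punctuation character.
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*\??|[(),{}])")


def tokenize(text):
    """Split a function header into tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_variable(token):
    """Anything that is not the keyword or punctuation counts as a variable."""
    return token not in {"function", "(", ")", ",", "{", "}"}


def parse_header(text):
    """Recognize: 'function' variable '(' optionalParameterList ')'."""
    tokens = tokenize(text)
    pos = 0

    def expect(literal):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != literal:
            raise SyntaxError(f"expected {literal!r}")
        pos += 1

    def variable():
        nonlocal pos
        if pos >= len(tokens) or not is_variable(tokens[pos]):
            raise SyntaxError("expected a variable")
        pos += 1
        return tokens[pos - 1]

    def parameter_list():
        # parameterList : variable | variable ',' parameterList
        nonlocal pos
        names = [variable()]
        if pos < len(tokens) and tokens[pos] == ",":
            pos += 1
            names += parameter_list()
        return names

    expect("function")
    name = variable()
    expect("(")
    # optionalParameterList : (nothing) | parameterList
    if pos < len(tokens) and tokens[pos] == ")":
        parameters = []
    else:
        parameters = parameter_list()
    expect(")")
    return name, parameters


print(parse_header("function even?(x)"))   # ('even?', ['x'])
print(parse_header("function test(x,y)"))  # ('test', ['x', 'y'])
print(parse_header("function now()"))      # ('now', [])
```

Each grammar rule becomes one small function, and the recursive rule for parameterList becomes a recursive call.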

The parameter list is composed of zero or more variable names, separated by commas. Parameters are local variables that will be bound to the values given in the call to the function. In the particular case of even?, the variable x will be bound to the number whose evenness is to be determined. It is customary to call x a formal parameter of the function even?. In function calls, the values to be bound to the formal parameters are called arguments.

Function Objects

Let's look at the body of even?. The % operator is bound to the remainder or modulus function. The == operator is bound to the equality function and determines whether the value of the left operand expression is equal to the value of the right operand expression, yielding true or false as appropriate. The value produced by == is then immediately returned as the value of the function.

When given a function definition like that above, Sway performs a couple of tasks. The first is to create the internal form of the function, known as a function object, which holds the function's name, parameter list, and body, plus the current environment. The second task is to add the function name and the function object to the current environment as a variable-value binding. Thus the name of the function is simply a variable that happens to be bound to a function object. As noted before, we often say 'the function even?' even though we really mean 'the function bound to the variable even?'.

The value of a function definition is the function object, which has type FUNCTION (indicating a user of Sway has defined the function) and a printable value of <function definitionName formalParameters>, where definitionName is the name of the function when created and formalParameters is the parenthesized list of parameters. Although the print value of the function object only lists the original name and the parameters, the actual object contains the body and context in which it was created (known as the defining environment) as well.
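The two tasks performed for a function definition can be modeled outside of Sway. The following is a rough Python sketch, not Sway's implementation: the class FunctionObject and the helper define_function are hypothetical names, an environment is reduced to a plain dictionary, and the body is kept as uninterpreted text.

```python
class FunctionObject:
    """Holds a function's name, parameters, body, and defining environment."""

    def __init__(self, name, parameters, code, context):
        self.name = name
        self.parameters = parameters
        self.code = code        # the body, kept here as plain text
        self.context = context  # the defining environment

    def __str__(self):
        # The print value shows only the name and the parameter list.
        # Joining with "," is an assumption for multi-parameter functions.
        return f"<function {self.name}({','.join(self.parameters)})>"


def define_function(env, name, parameters, code):
    # Task 1: create the function object, remembering the current environment.
    function_object = FunctionObject(name, parameters, code, env)
    # Task 2: bind the function's name to the object in the current environment.
    env[name] = function_object
    # The value of the definition is the function object itself.
    return function_object


global_env = {}
value = define_function(global_env, "even?", ["x"], "{ x % 2 == 0; }")
print(value)                         # <function even?(x)>
print(global_env["even?"] is value)  # True
```

The last line shows that the name even? is an ordinary binding whose value happens to be a function object.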

We can see the actual components of a function object by passing the function object to the built-in ppObject function. The pp in ppObject stands for pretty printing, which means to display in a pleasing format.

sway> ppObject(even?);
<FUNCTION 2421>:
context: <OBJECT 749>
prior: :null
filter: :null
parameters: (x)
code: { return x % 2 == 0; }
name: even?
FUNCTION: <function even?(x)>

In addition to some other fields we will learn about later, we see that the parameters, code, and name fields look as expected. The only unexpected item is that the code contains a return where it did not before. You will learn more about returns in a later chapter. For now, we'll just say that returns allow you to return a value from somewhere other than the last expression in the function body.

While we are on the subject of pretty printing, we can also look at even? with the pp function:

sway> pp(even?);
function even?(x)
{
return x % 2 == 0;
}
FUNCTION: <function even?(x)>

Calling Functions

Once a function is created, it is used by calling the function with arguments. A function is called by supplying the name of the function followed by a parenthesized, comma-separated list of expressions. The arguments are the values of those expressions and are to be bound to the formal parameters. In general, if there are n formal parameters, there should be n arguments (see footnote 1). Furthermore, the value of the first argument is bound to the first formal parameter, the value of the second argument is bound to the second formal parameter, and so on. Moreover, the arguments are usually evaluated before being bound to the parameters (see footnote 2).

Once the evaluated arguments are bound to the parameters, then the body of the function is evaluated. Most times, the expressions in the body of the function will reference the parameters. If so, how does the interpreter find the values of those parameters? That question is answered in the next section.

Scope

The formal parameters of a function can be thought of as variable definitions that are only in effect when the body of the function is being evaluated. That is, those variables are only visible in the body and nowhere else. For that reason, parameters are considered to be local variable definitions, since they only have local effect (the function body). Any direct reference to those particular variables outside the body of the function is not allowed (see footnote 3). Consider the following interaction with the interpreter:

sway> function square(a) { return a * a; }
FUNCTION: <function square(a)>

sway> square(4);
INTEGER: 16

sway> a;
EVALUATION ERROR: :undefinedVariable
variable a is undefined
stdin,line 3: a;

Scope refers to where a variable is visible.

In the above example, the scope of variable a is restricted to the body of the function square. Any reference to a other than in the context of square is invalid. Now consider a slightly different interaction with the interpreter:

sway> var a = 10;
INTEGER: 10

sway> var b = 1;
INTEGER: 1

sway> function almostSquare(a) { return a * a + b; }
FUNCTION: <function almostSquare(a)>

sway> almostSquare(4);
INTEGER: 17

In this dialog, two variable definitions, a and b, precede the definition of almostSquare. In addition, the variable serving as the formal parameter of almostSquare has the same name as the first variable defined in the dialog. Moreover, the body of almostSquare refers to both variables a and b. Variable a is defined twice (once as a regular variable and once as a formal parameter) while variable b is referenced but is not a formal parameter. Although it seems confusing at first, the Sway interpreter has no difficulty in figuring out what's what. From the responses of the interpreter, the b in the body must refer to the variable that was defined with an initial value of 1 (since it is the only b). The a in the function body must refer to the formal parameter whose value was set to 4 by the call to the function (given the output of the interpreter).

When a formal parameter (or any definition local to a function) has the same name as another variable that is also in scope, the formal parameter is said to shadow the other. The term shadowed refers to the fact that the other variable is in the shadow of the formal parameter and cannot be seen. A variable is said to be in scope if it is bound in the current environment or in the environment bound to the context variable in the current environment. We can see this clearly in Sway by looking at bindings. Consider this dialog:

sway> var a = 10;
INTEGER: 10

sway> var b = 1;
INTEGER: 1

sway> function almostSquare(a) { pp(this); a * a + b; }
FUNCTION: <function almostSquare(a)>

Note that in the body of this version of almostSquare, we pretty print the current environment.

sway> almostSquare(4);
<OBJECT 2569>:
context: <OBJECT 749>
dynamicContext: <OBJECT 749>
callDepth: 1
constructor: <function almostSquare(a)>
this: <OBJECT 2569>
a: 4
INTEGER: 17

It is in this current environment that the value of a is retrieved in calculating the return value. We see that, indeed, a has a value of 4. But where is the value of b found?

When the value of a variable is needed, Sway looks in the current environment (this). If the variable is not found there, Sway looks in context. If the variable is not in context, it looks in the context of context, and so on. Let's modify almostSquare to illustrate:

sway> function almostSquare(a) { pp(this); pp(context); a * a + b; }
FUNCTION: <function almostSquare(a)>

This latest version pretty prints both its current environment and its context:

sway> almostSquare(4);
<OBJECT 2603>:
context: <OBJECT 749>
dynamicContext: <OBJECT 749>
callDepth: 1
constructor: <function almostSquare(a)>
this: <OBJECT 2603>
a: 4
<OBJECT 749>:
context: <OBJECT 18>
dynamicContext: :null
callDepth: 0
constructor: :null
this: <OBJECT 749>
almostSquare: <function almostSquare(a)>
b: 1
a: 10
SwayEnv: ["SSH_AGENT_PID=5076","SHELL=/bin/bash","TERM=x...
SwayArgs: :null
INTEGER: 17

Here we see the bindings of a, b, and almostSquare, as expected.

This shows two things:

1. that within the function body, formal parameters are found in the current environment
2. that the context of the environment active in a function call is bound to the defining environment of the function.

When a reference to a is made, a search is made of the current environment. Within the function body, the value of 4 is immediately found. When the value of b is required, it is not found in the current environment. The interpreter then searches the current environment's context (in this case <OBJECT 749>). This object does have a binding for b.

Since a has a value of 4 and b has a value of 1, the value of 17 is returned by the function. Finally, the last interaction with the interpreter illustrates the fact that the initial binding of a was unaffected by the function call.
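The search just described can be sketched in a few lines of Python. This is only a model under simplifying assumptions: the class Environment and its lookup method are hypothetical names, and an environment is reduced to a table of bindings plus a link to its context.

```python
class Environment:
    """A table of bindings plus a link to the enclosing (context) environment."""

    def __init__(self, bindings, context=None):
        self.bindings = dict(bindings)
        self.context = context

    def lookup(self, name):
        # Search the current environment, then its context,
        # then the context of the context, and so on.
        env = self
        while env is not None:
            if name in env.bindings:
                return env.bindings[name]
            env = env.context
        raise NameError(f"variable {name} is undefined")


# The top-level definitions: var a = 10; var b = 1;
global_env = Environment({"a": 10, "b": 1})

# The environment in effect during the call almostSquare(4).
call_env = Environment({"a": 4}, context=global_env)

print(call_env.lookup("a"))    # 4: the parameter shadows the outer a
print(call_env.lookup("b"))    # 1: found in the context
print(global_env.lookup("a"))  # 10: the outer a is unaffected
```

Shadowing falls out of the search order: the loop stops at the first environment that has a binding for the name, so the outer a is never reached from inside the call.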

In general, a variable found in the current environment is considered local. A variable that is in scope but is not in the current environment is considered non-local. Alternatively, local variables are considered to reside in the local scope while non-local variables are said to reside in the non-local scope. Local and non-local scope are sometimes referred to as inner and outer scope, respectively. If a non-local variable resides in the outermost scope, it is considered a global variable and the environment holding the bindings of global variables is called the global environment. In Sway, the initial environment is the global environment. Contrary to the definition above, Sway's global environment does have an outer scope; this outer scope holds the bindings of the built-in functions. The built-in environment cannot be modified, so perhaps a better definition of the global environment is the outermost environment that can be modified by the programmer.

How did the environment <OBJECT 2603>, the environment under which the function body was executed, come into being? The evaluation of the expression almostSquare(4) triggers a number of actions. The first is the creation of a new environment that will hold the formal parameters of the function to be called (in this case, the single parameter a) bound to the values of the corresponding arguments in the call (in this case, the single argument has a value of 4). This new environment has its context variable bound to the context variable found in the function object associated with the function being called (this new environment is also populated with other pre-defined variables such as constructor). The body of the function being called is then executed under this new environment. The process of creating a new environment and linking it to another environment via its context variable is called extending an environment. To summarize, when a function call is performed, the following actions are performed:

• the arguments to the function call are evaluated under the current environment
• the function object associated with the function to be called is retrieved from the current environment
• the formal parameters are retrieved from the function object
• the defining environment is retrieved from the function object
• a new environment is extended from the defining environment
• the new environment is populated by binding the formal parameters to the evaluated arguments
• the function body is retrieved from the function object
• the function body is evaluated under the newly extended environment
• the result of this evaluation is returned as the result of the function call
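The steps above can be sketched in Python. This is a simplified model, not the Sway interpreter: Environment, FunctionObject, and call are hypothetical names, a function body is represented by a Python callable that receives the new environment, and each argument expression is a callable that receives the current environment. Variadic functions and delayed evaluation are ignored.

```python
class Environment:
    def __init__(self, context=None):
        self.bindings = {}
        self.context = context

    def lookup(self, name):
        env = self
        while env is not None:
            if name in env.bindings:
                return env.bindings[name]
            env = env.context
        raise NameError(f"variable {name} is undefined")


class FunctionObject:
    def __init__(self, name, parameters, body, context):
        self.name = name
        self.parameters = parameters
        self.body = body        # a Python callable standing in for Sway code
        self.context = context  # the defining environment


def call(current_env, function_name, argument_exprs):
    # Evaluate the arguments under the current environment.
    arguments = [expr(current_env) for expr in argument_exprs]
    # Retrieve the function object from the current environment.
    function_object = current_env.lookup(function_name)
    # Retrieve the formal parameters and the defining environment.
    parameters = function_object.parameters
    defining_env = function_object.context
    if len(arguments) != len(parameters):
        raise TypeError("wrong number of arguments")
    # Extend the defining environment with a new environment.
    new_env = Environment(context=defining_env)
    # Bind the formal parameters to the evaluated arguments, in order.
    for parameter, argument in zip(parameters, arguments):
        new_env.bindings[parameter] = argument
    # Evaluate the body under the new environment and return the result.
    return function_object.body(new_env)


global_env = Environment()
global_env.bindings["a"] = 10
global_env.bindings["b"] = 1
global_env.bindings["almostSquare"] = FunctionObject(
    "almostSquare",
    ["a"],
    lambda env: env.lookup("a") * env.lookup("a") + env.lookup("b"),
    global_env,
)

result = call(global_env, "almostSquare", [lambda env: 4])
print(result)                  # 17
print(global_env.lookup("a"))  # 10
```

Note that the new environment is extended from the defining environment stored in the function object, not from the environment of the caller. That is what makes b resolve to the top-level binding.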

There are two important concepts about function calls: the static chain and the dynamic chain. We have already seen the static chain: the current environment, the context of the current environment, the context of the context, and so on. It is this chain that is searched for the value of a referenced variable. The dynamic chain, on the other hand, is the current environment, the environment of the calling function, the environment of the caller of the calling function, and so on. Unlike in most languages, in Sway the dynamic chain is available for programmers to search and manipulate. For the moment, that kind of manipulation is beyond us, so we will postpone that topic to a later date.
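The difference between the two chains shows up when a function is defined in one place and called from another. The Python sketch below is only an illustration: Environment and search are hypothetical names, the field dynamic_context mirrors the dynamicContext entry seen in the pretty-printed environments, and the functions outer and inner are invented for the example.

```python
class Environment:
    def __init__(self, bindings, context=None, dynamic_context=None):
        self.bindings = dict(bindings)
        self.context = context                  # link in the static chain
        self.dynamic_context = dynamic_context  # link in the dynamic chain


def search(env, name, link):
    """Walk the chain selected by link: 'context' or 'dynamic_context'."""
    while env is not None:
        if name in env.bindings:
            return env.bindings[name]
        env = getattr(env, link)
    raise NameError(f"variable {name} is undefined")


# Top level: x is 1, and both outer and inner are defined here.
global_env = Environment({"x": 1})

# A call to outer from the top level; outer has its own local x.
outer_env = Environment({"x": 2}, context=global_env, dynamic_context=global_env)

# A call to inner made from inside outer. inner was defined at the top
# level, so its static link is the global environment, while its dynamic
# link is the environment of its caller.
inner_env = Environment({}, context=global_env, dynamic_context=outer_env)

print(search(inner_env, "x", "context"))          # 1
print(search(inner_env, "x", "dynamic_context"))  # 2
```

Ordinary variable references follow the first search, which is why a function sees the variables of the place where it was defined rather than those of its caller.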

Returning from functions

The return value of a function is the value of the last expression evaluated when executing the function body. The return function is used to make an expression anywhere in the function be the last expression evaluated.

function test(x,y)
    {
    if (y == 0)
        {
        return(0);
        }
    println("good value for y!");
    return (x / y);
    }

In the example, if y is zero, then there is an immediate return from the function and no other expressions in the function body are evaluated. If not, a message is printed and the quotient is returned. The final return is not really needed; just having x / y as the final expression would work as well.

To make Sway, which is a functional language, look like C and Java, a call to the return function has an alternate syntax: the parentheses enclosing the single argument can be omitted.

function test(x,y)
    {
    if (y == 0) { return 0; }
    println("good value for y!");
    x / y;
    }

Footnotes

1. For variadic functions, the number of arguments may be more than the number of formal parameters.
2. It is possible to delay evaluation of arguments. See the chapter on Lazy Evaluation for more details.
3. One can sometimes get to these variables indirectly, in the case of objects.