Introducing Julia/Metaprogramming

From Wikibooks, open books for an open world

What is metaprogramming?

Meta-programming is when you write Julia code to process and modify Julia code. With the meta-programming tools, you can write Julia code that modifies other parts of your source files, and even control if and when the modified code runs.

In Julia, the execution of raw source code takes place in two stages. (In reality there are more stages than this, but at this point we'll focus on just these two.)

Stage 1 is when your raw Julia code is parsed, that is, converted into a form that is suitable for evaluation. You'll be familiar with this phase, because this is when all your syntax mistakes are noticed. The result is an abstract syntax tree (AST), a structure that contains all your code, but in a format that is easier to manipulate than the human-friendly syntax normally used.

Stage 2 is when that parsed code is executed. Usually, when you type code into the REPL and press Return, or when you run a Julia file from the command line, you don't notice the two stages, because they happen so quickly. However, with Julia's metaprogramming facilities, you can access the code after it's been parsed but before it's evaluated.

This lets you do things that you can't normally do. For example, you can convert simple expressions to more complicated expressions, or examine code before it runs and change it so that it runs faster. Any code that you intercept and modify using these meta-programming tools will eventually be evaluated in the usual way, running as fast as ordinary Julia code.
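The two stages can be run separately by hand. The following is a minimal sketch, assuming a Julia 1.x session in which Meta.parse is available; the variable names are illustrative only.

```julia
# Stage 1: parse a string of source code into an unevaluated expression.
source = "1 + 2 * 3"
ast = Meta.parse(source)

println(typeof(ast))   # Expr
println(ast.head)      # call
println(ast.args)      # the operator and its two operands, still unevaluated

# Stage 2: evaluate the parsed expression.
result = eval(ast)
println(result)        # 7
```

Between the two calls, ast is an ordinary Julia value that can be inspected or changed before it is evaluated.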

You may have already used two existing examples of meta-programming in Julia:

- the @time macro:

julia> @time [sin(cos(i)) for i in 1:100000];
elapsed time: 0.00721026 seconds (800048 bytes allocated)

The @time macro inserts a "start the stopwatch" command at the beginning of the code, before passing the expression on to be evaluated. When the code has finished running, a "finish the stopwatch" command is added, followed by the calculations to report the elapsed time and memory usage.

- the @which macro

julia> @which 2 + 2
+(x::Int64,y::Int64) at int.jl:33

This macro doesn't allow the expression 2 + 2 to be evaluated at all. Instead, it reports which method would be used for these particular arguments. It also tells you the source file that contains the method's definition, and the line number.

Other uses for meta-programming include the automation of tedious coding jobs by writing short pieces of code that produce larger chunks of code, and the ability to improve the performance of 'standard' code by producing the sort of faster code that perhaps you wouldn't want to write by hand.

Quoted expressions

For meta-programming to work, there has to be a way to stop Julia evaluating expressions as soon as the parsing phase has finished. This is the ':' (colon) prefix operator:

julia> x = 3
3
julia> :x
:x

To Julia, the :x is an unevaluated or quoted symbol.

(If you're unfamiliar with the use of quoted symbols in computer programming, think of how quotes are sometimes used in writing to distinguish between ordinary use and special use. For example, in the sentence:

'Copper' contains six letters.

the quotes indicate that the word 'Copper' is not a reference to the metal, but to the word itself. In the same way, in :x, the colon before the symbol is to make you and Julia think of 'x' as an unevaluated symbol rather than as the value 3.)

To quote whole expressions rather than individual symbols, start with a colon and then enclose the Julia expression in parentheses:

julia> :(2 + 2)

:(2 + 2)

There's an alternative form of the :( ) construction that uses the quote ... end keywords to enclose and quote an expression:

julia> quote
           2 + 2
       end

quote  # none, line 2:
    2 + 2
end

julia> expression = quote
          for i = 1:10
              println(i)
          end
       end

quote  # none, line 3:
    for i = 1:10 # none, line 4:
        println(i)
    end
end

This object expression is of type Expr:

julia> typeof(expression)
Expr

It's parsed, primed, and ready to go.
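As a quick check of the forms described above, this sketch (assuming Julia 1.x; the names sym, ex1 and ex2 are illustrative) compares a quoted symbol, a :( ) expression, and a quote ... end block:

```julia
x = 3

sym = :x            # a quoted symbol
ex1 = :(2 + 2)      # a quoted expression
ex2 = quote         # the block form
    2 + 2
end

println(typeof(sym))   # Symbol
println(typeof(ex1))   # Expr
println(ex1.head)      # call
println(ex2.head)      # block
```

The block form wraps its contents in an extra :block expression that also records line numbers, which is why its printed form differs from that of :(2 + 2).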

Evaluating expressions

There's also a function for evaluating an unevaluated expression. It's called eval():

julia> eval(:x)
3
julia> eval(:(2 + 2))
4
julia> eval(expression)
1
2
3
4
5
6
7
8
9
10

With these tools, it's possible to create any expression and store it without having it evaluate:

julia> e = :(
           for i in 1:10
               println(i)
           end
       )

:(for i = 1:10 # line 2:
    println(i)
end)

and then to recall and evaluate it later:

julia> eval(e)
1
2
3
4
5
6
7
8
9
10

It's also possible to modify the contents of the expression before it's evaluated.
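Quoting is not the only way to obtain an expression. As a sketch (assuming Julia 1.x; the variable names are illustrative), the Expr constructor builds the same kind of object directly from a head and its arguments, and the result can be stored and evaluated later in the same way:

```julia
# Build the expression 2 + 2 from its parts: a call to + with two arguments.
ex = Expr(:call, :+, 2, 2)
println(ex == :(2 + 2))   # true

# Store an expression now...
stored = Expr(:call, :*, 6, 7)

# ...and evaluate it later.
println(eval(stored))     # 42
```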

Inside Expressions

Once you have Julia code in an unevaluated expression, rather than as a piece of text in a string, you can do things with it.

Here's another expression:

julia> P = quote
           a = 2
           b = 3
           c = 4
           d = 5
           e = sum([a,b,c,d])
       end

quote  # none, line 2:
    a = 2 # line 3:
    b = 3 # line 4:
    c = 4 # line 5:
    d = 5 # line 6:
    e = sum([a,b,c,d])
end

Notice the helpful line numbers that have been added to each line of the quoted expression. (Don't be confused by the fact that the labels for each line are added on the end of the previous line.)

We can use the fieldnames() function to see what's inside this expression:

julia> fieldnames(P)
3-element Array{Symbol,1}:
 :head
 :args
 :typ

The head field is :block. The args field is another array, containing expressions (including the line-number nodes, which are displayed like comments). We can examine these with the usual Julia techniques. For example, here is the second subexpression:

julia> P.args[2]
:(a = 2)

Print them out:

julia> for (n, expr) in enumerate(P.args)
    println(n, ": ", expr)
end

1:  # none, line 2:
2: a = 2
3:  # none, line 3:
4: b = 3
5:  # none, line 4:
6: c = 4
7:  # none, line 5:
8: d = 5
9:  # none, line 6:
10: e = sum([a,b,c,d])

As you can see, the expression P contains a number of sub-expressions. We can modify this expression quite easily; for example, we can change the last line of the expression to use prod() rather than sum(), so that, when P is evaluated, it will return the product rather than the sum of the variables.

julia> eval(P)
14

julia> P.args[end] = quote prod([a,b,c,d]) end
quote  # none, line 1:
    prod([a,b,c,d])
end

julia> eval(P)
120

Alternatively, starting again from the original definition of P, you can target the sum() symbol directly by burrowing into the expression:

julia> P.args[end].args[end].args[1]
:sum

julia> P.args[end].args[end].args[1] = :prod
:prod

julia> eval(P)
120
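Burrowing by index works when you know exactly where a symbol sits. A more general approach is to walk the whole expression. The following is a sketch only, assuming Julia 1.x; replace_symbol is a hypothetical helper written for this example, not a built-in function.

```julia
# Hypothetical helper: return a copy of an expression in which every
# occurrence of the symbol `old` has been replaced by `new`.
replace_symbol(ex, old, new) = ex                              # numbers, line nodes, etc.
replace_symbol(ex::Symbol, old, new) = ex == old ? new : ex    # a matching symbol is swapped
replace_symbol(ex::Expr, old, new) =
    Expr(ex.head, [replace_symbol(a, old, new) for a in ex.args]...)

P = quote
    a = 2
    b = 3
    c = 4
    d = 5
    e = sum([a, b, c, d])
end

Q = replace_symbol(P, :sum, :prod)

println(eval(P))   # 14
println(eval(Q))   # 120
```

Because the helper builds a new expression rather than editing P in place, the original P is still available afterwards.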

The Abstract Syntax Tree

This way of representing your code once it's been parsed is referred to as the AST (Abstract Syntax Tree). It's a nested hierarchical structure that's designed to allow both you and Julia to easily process and modify the code.

The very useful dump function lets you easily visualise the hierarchical nature of an expression. For example, the expression :(1 * sin(pi/2)) is represented like this:

julia> dump(:(1 * sin(pi/2)))
Expr
  head: Symbol call
  args: Array{Any}((3,))
    1: Symbol *
    2: Int64 1
    3: Expr
      head: Symbol call
      args: Array{Any}((2,))
        1: Symbol sin
        2: Expr
          head: Symbol call
          args: Array{Any}((3,))
            1: Symbol /
            2: Symbol pi
            3: Int64 2
          typ: Any
      typ: Any
  typ: Any

You can see that the AST consists entirely of Exprs and atoms (e.g. symbols, numbers).
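Since the tree consists only of Exprs and atoms, a short recursive function can visit every node. This sketch (assuming Julia 1.x; leaves is a hypothetical helper name) collects the atoms of the expression shown above in the order they appear:

```julia
# Hypothetical helper: collect the atoms (non-Expr values) of an expression.
function leaves(ex)
    ex isa Expr || return Any[ex]   # an atom is its own only leaf
    out = Any[]
    for a in ex.args                # recurse into each argument in turn
        append!(out, leaves(a))
    end
    return out
end

atoms = leaves(:(1 * sin(pi/2)))
println(atoms)   # the atoms *, 1, sin, /, pi and 2, in that order
```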

Expression interpolation

In a way, strings and expressions are similar — any Julia code they happen to contain is usually unevaluated, but you can have some of the code evaluated using interpolation. We've met the string interpolation operator, the dollar sign ($). When used inside a string, and possibly with parentheses to enclose the expression, this evaluates the Julia code and inserts the resulting value into the string at that point:

julia> "the sine of 1 is $(sin(1))"
"the sine of 1 is 0.8414709848078965"

In just the same way, you can use the dollar sign to include the results of executing Julia code interpolated into an expression (which is otherwise unevaluated):

julia> quote s = $(sin(1) + cos(1)); end
quote  # none, line 1:
    s = 1.3817732906760363
end

Even though this is a quoted expression and hence unevaluated, the value of sin(1) + cos(1) was calculated and inserted into the expression, replacing the original code. This operation is called "splicing".

As with string interpolation, the parentheses are needed only if you want to include the value of an expression — a single symbol can be interpolated using just a single dollar sign.
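Both forms of interpolation can be tried in a short sketch (assuming Julia 1.x; the names x, name and total are illustrative only):

```julia
x = 10

# A single variable needs only the dollar sign.
ex1 = :(y = $x + 1)
println(ex1)          # y = 10 + 1

# A symbol held in a variable can be interpolated as a name,
# here on the left-hand side of an assignment.
name = :total
ex2 = :($name = $x * 2)
println(ex2)          # total = 10 * 2

eval(ex2)
println(total)        # 20
```

The value of x is captured when the expression is built, so changing x afterwards does not alter ex1 or ex2.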

Macros

Once you know how to create and handle unevaluated Julia expressions, you'll want to know how you can modify them. A macro is a way of generating a new output expression, given an unevaluated input expression. When your Julia program runs, it first parses and evaluates the macro, and the processed code produced by the macro is eventually evaluated like an ordinary expression.

Here's the definition of a simple macro that prints out the contents of the thing you pass to it, and then returns the expression to the calling environment (here, the REPL). The syntax is very similar to the way you define functions:

macro p(n)
    if typeof(n) == Expr 
       println(n.args)
    end
    return n
end

You run macros by preceding the name with the @ prefix. This macro is expecting a single argument. Because you're providing unevaluated Julia code, you don't have to enclose it in parentheses, as you do for function arguments.

First, let's call this with a single numeric argument:

julia> @p 3
3

Numbers aren't expressions, so the if condition inside the macro didn't apply. All the macro did was return n. But if you pass an expression, the code in the macro has the opportunity to inspect and/or process the expression's content before it is evaluated, using the .args field:

julia> @p 3 + 4 - 5 * 6 / 7 % 8
Any[:-,:(3 + 4),:(((5 * 6) / 7) % 8)]
2.7142857142857144

In this case, the if condition was triggered, and the arguments of the incoming expression were printed in unevaluated form. So you can see the arguments as an array of expressions after being parsed by Julia but before being evaluated. You can also see how the different precedence of arithmetic operators has been taken into account in the parsing operation. Notice how the top-level operators and subexpressions are quoted with a colon (:).

In this simple example, the macro p returned the argument, which was then evaluated. But it doesn't have to — it could return a quoted expression instead.

As an example, the built-in @time macro returns a quoted expression rather than using eval() to evaluate the expression inside the macro. The quoted expression returned by @time is evaluated in the calling context when the macro has done its work. Here's a simplified version of the definition:

macro time(ex)
    quote
        local t0 = time()
        local val = $(esc(ex))
        local t1 = time()
        println("elapsed time: ", t1-t0, " seconds")
        val
    end
end

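Notice the $(esc(ex)) expression. This is the way that you 'escape' the code you want to time, which is in ex, so that it is inserted into the quoted expression untouched and runs in the calling context when the returned expression is evaluated there. If this just said $ex, the expression would still be inserted unevaluated, but Julia's macro hygiene would then process it: the variables it mentions would be treated as belonging to the macro (renamed, or looked up in the module where the macro was defined) rather than to the calling context.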
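The same pattern, a macro that returns a quoted expression containing the escaped argument, can be sketched with a smaller example. This assumes Julia 1.x; @twice is a hypothetical macro invented for illustration, and a Ref is used as the counter so that the example behaves the same in a script and in the REPL.

```julia
# Hypothetical macro: return code that runs the given expression twice.
macro twice(ex)
    quote
        $(esc(ex))
        $(esc(ex))
    end
end

counter = Ref(0)          # a mutable box holding an integer
@twice counter[] += 1     # expands to two copies of `counter[] += 1`
println(counter[])        # 2
```

The macro itself never evaluates ex; it only arranges for two copies of it to appear in the code that is eventually run in the calling context.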

If you want to pass a multi-line expression to a macro, use the begin ... end form:

julia> @p begin
    2 + 2 - 3
end

Any[:( # none, line 2:),:((2 + 2) - 3)]
1

(You can also call macros with parentheses similar to the way you do when calling functions, using the parentheses to enclose the arguments:

julia> @p(2 + 3 + 4 - 5)
Any[:-,:(2 + 3 + 4),5]
4

With this form, the arguments are separated by commas, which can be convenient for macros that accept more than one expression as arguments.)

eval() and @eval

There's an eval() function, and an @eval macro. You might be wondering what the difference between the two is.

julia> ex = :(2 + 2)
:(2 + 2) 

julia> eval(ex)
4

julia> @eval ex
:(2 + 2)

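The function version evaluates its argument in the usual way, obtaining the expression stored in ex, and then evaluates that expression. The macro version receives its argument unevaluated, so @eval ex just evaluates the symbol ex, which gives back the stored expression. To evaluate the contents of ex with the macro, use the interpolation syntax to insert the expression: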

julia> @eval $(ex)
4

In other words:

julia> @eval $(ex) == eval(ex)
true
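One place where @eval with interpolation is often convenient is generating several similar definitions in a loop. This is a sketch under the assumption of Julia 1.x; the function names double and triple are hypothetical.

```julia
# Define double(x) and triple(x) from a table of names and factors.
for (name, factor) in ((:double, 2), (:triple, 3))
    # $name and $factor are interpolated before the definition is evaluated.
    @eval $name(x) = $factor * x
end

println(double(4))   # 8
println(triple(4))   # 12
```

Each pass through the loop builds the expression for one method definition and evaluates it at global scope.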

Scope and context

When you use macros, you have to keep an eye out for scoping issues. In the @time example above, the $(esc(ex)) syntax was used to prevent the expression from being evaluated in the wrong context. Here's another contrived example to illustrate this point.

macro f(x)
    quote
        s = 4
        (s, $(esc(:s)))
    end
end

This macro declares a variable s, and returns a quoted expression containing s and an escaped version of s.

Now, outside the macro, define a variable s:

julia> s = 0

Run the macro:

julia> @f 2
(4,0)

You can see that the macro returned different values for the symbol s: the first was the value inside the macro's context, 4, the second was an escaped version of s, that was evaluated in the calling context, where s has the value 0. In a sense, esc() has protected the value of s as it passes unharmed through the macro. For the more realistic @time example, it's important that the expression you want to time isn't modified in any way by the macro.
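The effect can be reproduced with a self-contained sketch, assuming Julia 1.x. The macro name @inner_and_outer is hypothetical, and the symbol is escaped explicitly as esc(:s) so that it refers to the caller's s.

```julia
# Hypothetical macro: pair the macro's own s with the caller's s.
macro inner_and_outer()
    quote
        s = 4                  # this s belongs to the macro; hygiene renames it
        (s, $(esc(:s)))        # the escaped s is looked up in the calling context
    end
end

s = 0
result = @inner_and_outer()
println(result)   # (4, 0)
println(s)        # 0
```

The caller's s is still 0 afterwards, because the assignment inside the macro was made to a renamed variable.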

Expanding macros

To see what the macro expands to just before it's finally executed, use the macroexpand() function. It expects a quoted expression containing one or more macro calls, which are then expanded into proper Julia code for you so that you can see what the macro would do when called.

julia> macroexpand(quote @p 3 + 4 - 5 * 6 / 7 % 8 end)
Any[:-,:(3 + 4),:(((5 * 6) / 7) % 8)]
quote  # none, line 1:
    (3 + 4) - ((5 * 6) / 7) % 8
end

(The #none, line 1: is a filename and line number reference that's more useful when used inside a source file than when you're using the REPL.)

Here's another example. This macro adds a dotimes construction to the language.

macro dotimes(n, body)
    quote
        for i = 1:$(esc(n))
            $(esc(body))
        end
    end
end

This is used as follows:

@dotimes 3 println("hi there")
hi there
hi there
hi there

Or, less likely, like this:

@dotimes 3 begin
    for i in 4:6
        println("i is $i")
    end
end

i is 4
i is 5
i is 6
i is 4
i is 5
i is 6
i is 4
i is 5
i is 6

If you use macroexpand() on this, you can see what happens to the symbol names:

macroexpand( 
    quote  
        @dotimes 3 begin
            for i in 4:6
                println("i is $i")
            end
        end
    end 
)

with the following output:

quote  # none, line 1:
    begin  # none, line 3:
        for #378#i = 1:3 # line 4:
            begin  # none, line 2:
                for i = 4:6 # line 3:
                    println("i is $i")
                end
            end
        end
    end
end

The i local to the macro itself has been renamed to #378#i, so as not to clash with the original i in the code I've passed to it.
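The renaming can be observed from the caller's side too. In this sketch (assuming Julia 1.x, where the @macroexpand macro is available; the names seen and expanded are illustrative), the body passed to @dotimes refers to the caller's own i, and it keeps doing so even though the macro loops over a variable that is also written as i:

```julia
macro dotimes(n, body)
    quote
        for i = 1:$(esc(n))
            $(esc(body))
        end
    end
end

i = 100
seen = Int[]
@dotimes 3 push!(seen, i)   # the body's i is the caller's i, not the loop counter
println(seen)               # [100, 100, 100]

# Inspect the expansion; the macro's own loop variable appears under a generated name.
expanded = @macroexpand @dotimes 3 push!(seen, i)
println(expanded)
```

The exact generated name differs from session to session, so no output is shown for the last line.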

A more useful example: @until

Here's how to define a macro that is more likely to be useful in your code.

Julia doesn't have an until condition ... do some stuff ... end statement. Perhaps you'd like to type something like this:

until x > 100
    println(x)
end

With the new @until macro, you'll be able to write your code like this:

@until condition begin
    block_of_stuff
end

but, behind the scenes, the work will be done by actual code with the following structure:

while true
    block_of_stuff
    if condition
        break
    end
end

This forms the body of the new macro, and it will be enclosed in a quote ... end block, like this, so that it executes when evaluated, but not before:

quote
    while true
        block_of_stuff
        if condition
            break
        end
    end
end

So the nearly-finished macro code is like this:

macro until(condition, block_of_stuff)
    quote
        while true
            block_of_stuff
            if condition
                break
            end
        end
    end
end

All that remains to be done is to work out how to pass in our code for the block_of_stuff and the condition parts of the macro. Recall that $(esc(...)) allows code to pass through 'escaped', so that the variables it mentions refer to the caller's variables rather than being renamed by the macro. We'll escape both the condition and the block, so that they run in the calling context when the returned expression is evaluated.

The final macro definition is therefore:

macro until(condition, block)
    quote
        while true
            $(esc(block))
            if $(esc(condition))
                break
            end
        end
    end
end

The new macro is used like this:

julia> i = 0 
julia> @until i == 10 begin
           i += 1
           println(i)
       end

1
2
3
4
5
6
7
8
9
10

or

julia> x = 5
5

julia> @until x < 1 (println(x); x -= 1)
5
4
3
2
1
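The examples above were typed at the REPL. As a sketch of how the same macro might be used inside a function, where the escaped variables are the function's local variables, consider the following (assuming Julia 1.x; count_to is a hypothetical function written for this example):

```julia
macro until(condition, block)
    quote
        while true
            $(esc(block))
            if $(esc(condition))
                break
            end
        end
    end
end

# Hypothetical function: collect the integers from 1 up to limit.
function count_to(limit)
    i = 0
    seen = Int[]
    @until i == limit begin
        i += 1
        push!(seen, i)
    end
    return seen
end

println(count_to(5))   # [1, 2, 3, 4, 5]
```

Note that, like the while true loop it expands to, @until always runs the block at least once before testing the condition.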

Under the hood

If you want a more complete explanation of the compilation process than that provided here, visit the links shown in Further Reading, below.

Julia performs multiple 'passes' to transform your code to native machine code. As described above, the first pass parses the Julia code and builds the 'surface-syntax' AST, suitable for manipulation by macros. A second pass lowers this high-level AST into an intermediate representation, which is used by type inference and code generation. In this intermediate AST format all macros have been expanded and all control flow has been converted to explicit branches and sequences of statements. At this stage the Julia compiler attempts to determine the types of all variables so that the most suitable method of a generic function (which can have many methods) is selected.
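The later passes can be inspected from Julia itself. This is a rough sketch, assuming Julia 1.x, where the functions code_lowered and code_typed are available; absval is a hypothetical function defined only for this example, and the printed form of the results varies between Julia versions, so no output is shown.

```julia
# Hypothetical example function with a branch in it.
absval(x) = x > 0 ? x : -x

# The lowered form: control flow has become explicit branches.
lowered = code_lowered(absval, (Int,))
println(lowered[1])

# The form after type inference for an Int argument, paired with the inferred return type.
typed = code_typed(absval, (Int,))
println(typed[1])
```

Each call returns one entry per matching method; here there is a single method, so each result has one element.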

Further reading