Introducing Julia/Types


Types

This section, on types, and the next section, on functions and methods, should ideally be read at the same time, because the two topics are so closely connected.

Types of type

Data elements come in different shapes and sizes, which are called types.

Consider the following numeric values: a floating point number, a rational number, and an integer:

0.5  1//2  1

It's easy for us humans to add these numbers without much thought, but a computer won't be able to use a simple addition routine to add all three values, because the types are different. Code for adding rational numbers has to consider numerators and denominators, whereas code for adding integers won't. The computer will probably have to convert two of these values to be the same type as the third—typically the integer and the rational will first be converted to floating-point—then the three floating-point numbers will be added together.
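
To make the conversion step concrete, here is a minimal sketch using promote(), which converts its arguments to a single common type. The variable names are only for illustration, and the printed integer types assume a 64-bit system.

```julia
# Three values of three different numeric types
vals = (0.5, 1//2, 1)
println(map(typeof, vals))      # (Float64, Rational{Int64}, Int64) on a 64-bit system

# promote() converts all of its arguments to one common type
promoted = promote(vals...)
println(promoted)               # (0.5, 0.5, 1.0)

# So the sum of the three values is computed in floating point
total = 0.5 + 1//2 + 1
println(total, " is a ", typeof(total))
```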

This business of converting types obviously takes time. So, to write really fast code, you want to make sure that you don't make the computer waste time by continually converting values from one type to another. When Julia compiles your source code (which happens every time you evaluate a function for the first time), any type indications you've provided allow the compiler to produce more efficient executable code.

Another issue with converting types is that in some cases you'll be losing precision—converting a rational number to a floating-point number is likely to lose some precision.
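
As a small illustration of that loss of precision (a sketch, not a benchmark), compare the same sum done with rational numbers and with floating-point numbers:

```julia
# Rational arithmetic is exact
exact = 1//10 + 2//10
println(exact == 3//10)     # true

# The same sum in floating point picks up a small rounding error,
# because 0.1 and 0.2 have no exact binary representation
approx = 0.1 + 0.2
println(approx == 0.3)      # false
println(approx)             # 0.30000000000000004
```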

The official word from the designers of Julia is that types are optional. In other words, if you don't want to worry about types (and if you don't mind your code running slower than it might), then you can ignore them. But you'll encounter them in error messages and the documentation, so you will eventually have to tackle them…

A compromise is to write your top-level code without worrying about types, but, when you want to speed up your code, find out the bottlenecks where your program spends the most time, and clean up the types in that area.

The type system

There's a lot to know about Julia's type system, so the official documentation is really the place to go. But here's a brief overview.

Type hierarchy

In Julia types are organized in a hierarchy, with a tree structure.

At the tree's root, we have a special type called Any, and all other types are connected to it directly or indirectly. Informally, we can say that the type Any has children. Its children are called Any's subtypes. And a child's supertype is Any. (Note, however, hierarchical relationships between types are explicitly declared, rather than implied by compatible structure.)

We can see a good example of Julia's type hierarchy by looking at the Number types.

[Figure: type hierarchy for Julia numbers]

The type Number is a direct child of Any. To see what Number's supertype is, we can use the supertype() function:

julia> supertype(Number)
 Any

But we could also try to find Number's subtypes (Number's children, therefore Any's grandchildren). To do this, we can use the function subtypes():

julia> subtypes(Number)
2-element Array{Union{DataType, UnionAll},1}:
 Complex
 Real   

We can observe that we have two subtypes of Number: Complex and Real. For mathematicians, Real and Complex numbers are both, indeed, numbers. As a general rule, Julia's type hierarchy reflects the real world's hierarchy.

As another example, if both Jaguar and Lion were Julia types, it would be natural if their supertype were Feline. We would have:

julia> abstract type Feline end
julia> mutable struct Jaguar <: Feline end
julia> mutable struct Lion <: Feline end
julia> subtypes(Feline)
2-element Array{Any,1}:
 Jaguar
 Lion  

Concrete and abstract types

Each object in Julia (informally, this means everything you can put into a variable in Julia) has a type. But not every type can have objects (instances of that type). The types that can have instances are called concrete types, and they cannot have any subtypes. The types that can have subtypes (e.g. Any, Number) are called abstract types. Therefore we cannot have an object of type Number, since it's an abstract type. In other words, only the leaves of the type tree are concrete types and can be instantiated.
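
You can ask Julia directly whether a type is abstract or concrete. The following is a short sketch using the built-in isabstracttype() and isconcretetype() functions:

```julia
println(isabstracttype(Number))    # true: Number can have subtypes, but no instances
println(isconcretetype(Number))    # false
println(isconcretetype(Int64))     # true: Int64 is a leaf of the type tree

# Every value has a concrete type, and also counts as an instance of its supertypes
println(typeof(1.5))               # Float64
println(1.5 isa Number)            # true
```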

If we can't create objects of abstract types, why are they useful? With them, we can write code that generalizes for any of its subtypes. For instance, suppose we write a function that expects a variable of the type Number:

 #this function gets a number, and returns the same number plus one
 function plus_one(n::Number)
     return n + 1
 end

In this example, the function expects a variable n. The type of n must be a subtype of Number (directly or indirectly), as indicated with the :: syntax (but don't worry about the syntax yet). What does this mean? Whether n's type is Int (an integer) or Float64 (a floating-point number), the function plus_one() will work correctly. Furthermore, plus_one() will not work with any types that are not subtypes of Number (e.g. text strings, arrays).
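
Here is a short sketch of that behaviour. It repeats the definition of plus_one() so that it runs on its own, then calls it with several kinds of number and with a string:

```julia
# this function gets a number, and returns the same number plus one
function plus_one(n::Number)
    return n + 1
end

println(plus_one(1))       # 2
println(plus_one(1.5))     # 2.5
println(plus_one(1//2))    # 3//2

# A String is not a subtype of Number, so no method matches
try
    plus_one("one")
catch err
    println(typeof(err))   # MethodError
end
```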

We can divide concrete types into two categories: primitive (or basic), and complex (or composite). Primitive types are the building blocks, usually hardcoded into Julia's heart, whereas composite types group many other types to represent higher-level data structures.

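You'll probably see the following basic types (strictly speaking, BigInt, BigFloat, and String are not declared as primitive types inside Julia, but you use them in the same way):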

  • the basic integer and float types (signed and unsigned): Int8, UInt8, Int16, UInt16, Int32, UInt32, Int64, UInt64, Int128, UInt128, Float16, Float32, and Float64
  • more advanced numeric types: BigFloat, BigInt
  • Boolean and character types: Bool and Char
  • Text string types: String

A simple example of a composite type is Rational, used to represent fractions. It is composed of two pieces, a numerator and a denominator, both integers of the same type (Int, if you write something like 1//2).
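
To see the two pieces of a Rational for yourself, you could try something like the following sketch. The variable names are illustrative, and the Int64 results assume a 64-bit system.

```julia
r = 3//4
println(typeof(r))                             # Rational{Int64} on a 64-bit system
println(fieldnames(Rational))                  # (:num, :den)
println(numerator(r), " / ", denominator(r))   # 3 / 4

# The integer type of the two fields follows the arguments
small = Int8(3) // Int8(4)
println(typeof(small))                         # Rational{Int8}

# Int64 is a primitive type, whereas Rational is a composite (struct) type
println(isprimitivetype(Int64))                # true
println(isstructtype(Rational{Int64}))         # true
```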

Investigating types

Julia provides two functions for navigating the type hierarchy: subtypes() and supertype().

julia> subtypes(Integer)
4-element Array{Union{DataType, UnionAll},1}:
 BigInt  
 Bool    
 Signed  
 Unsigned

julia> supertype(Float64)
AbstractFloat
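
Calling supertype() repeatedly takes you all the way up to Any. The following sketch does this in a loop; ancestry() is a hypothetical helper name, not a built-in function. The <: operator tests whether one type is a subtype of another.

```julia
# Hypothetical helper: collect a type and all of its ancestors up to Any
function ancestry(T::Type)
    chain = Type[T]
    current = T
    while current !== Any
        current = supertype(current)
        push!(chain, current)
    end
    return chain
end

println(join(ancestry(Float64), " <: "))   # Float64 <: AbstractFloat <: Real <: Number <: Any
println(Float64 <: Real)                   # true
println(Float64 <: Integer)                # false
```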

The sizeof() function tells you how many bytes an item of this type occupies:

julia> sizeof(BigFloat)
 32

julia> sizeof(Char)
 4

If you want to know how big a number you can fit into a particular type, these two functions are useful:

julia> typemax(Int64)
 9223372036854775807

julia> typemin(Int32)
 -2147483648
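
It's worth knowing what happens at these limits. As the following sketch shows, ordinary integer arithmetic wraps around silently instead of raising an error, so you may want to switch to a bigger type (or to BigInt) if you expect very large values:

```julia
biggest = typemax(Int64)

# Adding 1 to the largest Int64 wraps around to the smallest one
println(biggest + 1 == typemin(Int64))   # true

# Converting to BigInt first avoids the overflow
println(big(biggest) + 1)                # 9223372036854775808

# Floating-point types have infinite limits
println(typemax(Float64))                # Inf
```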

There are over 340 types in the base Julia system. You can investigate the type hierarchy with the following function:

 using InteractiveUtils   # provides subtypes(); already loaded in the REPL
 
 function showtypetree(T, level=0)
     println("\t" ^ level, T)
     for t in subtypes(T)
         showtypetree(t, level+1)
     end
 end
 
 showtypetree(Number)

It produces something like this for the different Number types:

julia> showtypetree(Number)
Number
	Complex
	Real
		AbstractFloat
			BigFloat
			Float16
			Float32
			Float64
		Integer
			BigInt
			Bool
			Signed
				Int128
				Int16
				Int32
				Int64
				Int8
			Unsigned
				UInt128
				UInt16
				UInt32
				UInt64
				UInt8
		Irrational
		Rational

This shows, for example, the four main subtypes of Real number: AbstractFloat, Integer, Rational, and Irrational, as seen in the tree diagram.


Specifying the type of variables

We've already seen that Julia does its best to work out the types of things you put in your code, if you don't specify them:

julia> collect(1:10)
10-element Array{Int64,1}:
 1
 2
 3
 4
 5
 6
 7
 8
 9
10

julia> collect(1.0:10)
10-element Array{Float64,1}:
 1.0
 2.0
 3.0
 4.0
 5.0
 6.0
 7.0
 8.0
 9.0
10.0

And we've also seen that you can specify the type for a new empty array:

julia> fill!(Array{String}(undef, 3), "Julia")
3-element Array{String,1}:
 "Julia"
 "Julia"
 "Julia"

For a variable, you can specify the type that its value must have. Inside a definition this always works; at the top level (for example, in the REPL) global variables could not be given a type in older versions of Julia, although this has been possible since Julia 1.8. The :: syntax means "is of type". So:

function f(x::Int64)

means that the function f has a method that accepts an argument x which is expected to be an Int64. See Functions.
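
The :: syntax can be used in a few different places. The following sketch shows it on a function argument, on a local variable, and as a simple assertion on a value; halve() is a hypothetical function used only for illustration.

```julia
function halve(x::Int)
    result::Float64 = x / 2    # result must always hold a Float64
    return result
end

println(halve(7))              # 3.5

# On a value, :: is an assertion: the value passes through if the type matches
println((1 + 2)::Int)          # 3

# ... and a TypeError is thrown if it does not
try
    (1 + 2)::Float64
catch err
    println(typeof(err))       # TypeError
end
```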

Type stability

Here's an example of how the performance of Julia code is affected by the choice of types for variables. This is some code for exploring the Collatz conjecture.

function chain_length(n, terms)
    length = 0
    while n != 1
        if haskey(terms, n)
            length += terms[n]
            break
        end
        if n % 2 == 0      # is n even?
            n /= 2
        else
            n = 3n + 1
        end
        length += 1
    end
    return length
end

function main()
    ans = 0
    limit = 1_000_000
    score = 0
    terms = Dict()         # define a dictionary
    for i in 1:limit
        terms[i] = chain_length(i, terms)
        if terms[i] > score
            score = terms[i]
            ans = i
        end
    end
    return ans
end

We can time this, using the @time macro (although better benchmarking tools are available with the BenchmarkTools package).

julia> @time main() 
 2.634295 seconds (17.95 M allocations: 339.074 MiB, 13.50% gc time)

There are two lines of code which prevent the functions from being "type stable". These are places where the compiler is unable to use the best and most efficient types for the task in hand. Can you spot them?

The first is the division of n by 2, after testing whether n is even. n starts out as an integer, but the / division operator always returns a floating-point value. The Julia compiler can't produce pure integer code or pure floating-point code, and has to decide which to use at each stage. As a result, the compiled code isn't as fast or as concise as it could be.

The second problem is the definition of the dictionary. It's defined without type information, so both the keys and the values can be literally any type. While this is often OK, in this sort of task, where frequent accesses occur within loops, the compiled code has to allow for keys and values of any type on every access, which makes it slower and more complex.

julia> Dict()
Dict{Any, Any}()

If we tell the Julia compiler that this dictionary is only to contain integers (which is a good assumption here), the compiled code will be much more efficient, and type stable.

So, after changing n /= 2 to n ÷= 2, and terms = Dict() to terms = Dict{Int, Int}(), we would expect the compiler to make much more efficient code, and indeed it's much faster:

julia> @time main()
0.450561 seconds (54 allocations: 65.170 MiB, 19.33% gc time)

You can get some tips from the compiler about possible issues in your code due to type instability. For this function, for example, you could enter @code_warntype main() and look for items or "Any" highlighted in red.
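
The two problems can also be seen in isolation. In the following sketch, unstable() is a hypothetical one-line version of the Collatz step that uses /, so the type of its result depends on the value of its argument. The Int64 results assume a 64-bit system.

```julia
# / always gives a floating-point result; ÷ keeps integers as integers
println(typeof(6 / 2))     # Float64
println(typeof(6 ÷ 2))     # Int64 on a 64-bit system

# Hypothetical example: the result type changes with the value of n
unstable(n) = n % 2 == 0 ? n / 2 : 3n + 1
println(typeof(unstable(4)), " ", typeof(unstable(3)))   # Float64 Int64

# An untyped dictionary accepts anything; a typed one only integers
untyped = Dict()
typed = Dict{Int, Int}()
println(keytype(untyped), " => ", valtype(untyped))      # Any => Any
println(keytype(typed), " => ", valtype(typed))          # Int64 => Int64
```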

Creating types

In Julia, it's very easy for the programmer to create new types, which benefit from the same performance and the same integration into the language as the native types (those made by Julia's creators).

Abstract types

Suppose we want to create an abstract type. To do this, we use Julia's abstract type keyword, followed by the name of the type we want to create, and finish the declaration with end:

abstract type MyAbstractType end

By default, the type you create is a direct subtype of Any:

julia> supertype(MyAbstractType)
 Any

You can change this using the <: operator. If you want your new abstract type to be a subtype of Number, for example, you can declare:

abstract type MyAbstractType2 <: Number end

Now, we get:

julia> supertype(MyAbstractType2)
 Number

Notice that in the same Julia session (without exiting the REPL or ending the script) it's impossible to redefine a type. That's why we had to create a type called MyAbstractType2.
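
The <: operator is also an ordinary test that you can use to check where a type sits in the hierarchy. Here's a small sketch; Shape and Polygon are hypothetical types invented for this example.

```julia
abstract type Shape end               # hypothetical abstract type
abstract type Polygon <: Shape end    # hypothetical abstract subtype of Shape

println(supertype(Polygon) === Shape)   # true
println(Polygon <: Shape)               # true
println(Polygon <: Any)                 # true: everything is a subtype of Any
println(Shape <: Polygon)               # false: the relationship only goes one way
```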

Concrete types and composite

You can create new composite types. To do this, use the struct or mutable struct keyword; the supertype is declared with the same <: syntax as for abstract types. The new type can contain multiple fields, where the object stores values. As an example, let's define a concrete type that is a subtype of MyAbstractType:

 mutable struct MyType <: MyAbstractType
    foo
    bar::Int
 end

We just created a composite struct type called MyType, a subtype of MyAbstractType, with two fields: foo that can be of any type, and bar, that is of type Int.

How do we create an object of MyType? By default, Julia automatically creates a constructor, a function that returns an object of that type. The function has the same name as the type, and each argument of the function corresponds to one of the fields. In this example, we can create a new object by typing:

julia> x = MyType("Hello World!", 10)
 MyType("Hello World!", 10)

This creates a MyType object, assigning Hello World! to the foo field and 10 to the bar field. We can access x's fields by using the dot notation:

julia> x.foo
 "Hello World!"

julia> x.bar
 10

Also, we can change the field values of mutable structs easily:

julia> x.foo = 3.0
 3.0

julia> x.foo
 3.0

Notice that, since we didn't specify foo's type when we created the type definition, we can change its type at any time. This is different when we try to change the type of the x.bar field (which we specified as being an Int according to MyType's definition):

julia> x.bar = "Hello World!"
LoadError: MethodError: Cannot `convert` an object of type String to an object of type Int64
This may have arisen from a call to the constructor Int64(...),
since type constructors fall back to convert methods.

The error message tells us that Julia couldn't change x.bar's type. This ensures type-stable code, and can provide better performance when programming. As a performance tip, specifying a field's type when defining your types is usually good practice.
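
Assigning to a typed field doesn't require exactly the right type: Julia tries to convert the new value first, and only complains if the conversion fails. The following sketch shows this; Sample is a hypothetical type with the same layout as MyType, defined here so that the example runs on its own.

```julia
mutable struct Sample      # hypothetical type with the same layout as MyType
    foo
    bar::Int
end

s = Sample("Hello", 10)

s.bar = 3.0                # 3.0 is converted to the integer 3
println(s.bar)             # 3
println(typeof(s.bar))     # Int64 on a 64-bit system

try
    s.bar = 3.5            # 3.5 has no exact integer equivalent
catch err
    println(typeof(err))   # InexactError
end
```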

The default constructor is used for simple cases, where you type something like typename(field1, field2) to produce a new instance of the type. But sometimes you want to do more when you construct a new instance, such as checking the incoming values. For this you can use an inner constructor, a function inside the type definition. The next section shows a practical example.

Example: British currency

Here's an example of how you can create a simple composite type that can handle the old-fashioned British currency. Before Britain saw the light and introduced a decimal currency, the monetary system used pounds, shillings, and pence, where a pound consisted of 20 shillings, and a shilling consisted of 12 pence. This was called the £sd or LSD system (from the Latin librae, solidi, denarii, because the system originated in the Roman empire).

To define a suitable type, start a new composite type declaration:

 struct LSD

To contain a price in pounds, shillings, and pence, this new type should contain three fields: pounds, shillings, and pence:

   pounds::Int 
   shillings::Int
   pence::Int

The important task is to create a constructor function. This has the same name as the type, and accepts three values as arguments. After a few checks for invalid values, the special new() function creates a new object with the passed-in values. Remember we're still inside the type definition—this is an inner constructor.

  function LSD(a,b,c)
    if a < 0 || b < 0 || c < 0
      error("no negative numbers")
    end
    if c > 12 || b > 20
      error("too many pence or shillings")
    end
    new(a, b, c) 
  end

Now we can finish the type definition:

end

Here's the complete type definition again:

struct LSD
   pounds::Int 
   shillings::Int
   pence::Int
   
   function LSD(a, b, c)
    if a < 0 || b < 0 || c < 0
      error("no negative numbers")
    end
    if c > 12 || b > 20
      error("too many pence or shillings")
    end
    new(a, b, c) 
   end   
end

It's now possible to create new objects that store old-fashioned British prices. You create a new object of this type by using its name (which calls the constructor function):

julia> price1 = LSD(5, 10, 6)
LSD(5, 10, 6)

julia> price2 = LSD(1, 6, 8)
LSD(1, 6, 8)

And you can't create bad prices, because of the simple checks added to the constructor function:

julia> price = LSD(1, 0, 13)
ERROR: too many pence or shillings
Stacktrace:
[1] LSD(::Int64, ::Int64, ::Int64)

If you inspect the fields of one of the price 'objects' we've created:

julia> fieldnames(typeof(price1))
3-element Array{Symbol,1}:
 :pounds   
 :shillings
 :pence    

you can see the three fields, and these are storing the values:

julia> price1.pounds
5
julia> price1.shillings
10
julia> price1.pence
6

The next task is to make this new type behave in the same way as other Julia objects. For example, we can't add two prices:

julia> price1 + price2
ERROR: MethodError: no method matching +(::LSD, ::LSD)
Closest candidates are:
  +(::Any, ::Any, ::Any, ::Any...) at operators.jl:420

and the output could definitely be improved:

julia> price2
LSD(1, 6, 8)

Julia already has the addition function (+) with methods defined for many types of object. The following code adds yet another method that can handle two LSD objects:

function Base.:+(a::LSD, b::LSD)
    newpence = a.pence + b.pence
    newshillings = a.shillings + b.shillings
    newpounds = a.pounds + b.pounds
    subtotal = newpence + newshillings * 12 + newpounds * 240
    (pounds, balance) = divrem(subtotal, 240)
    (shillings, pence) = divrem(balance, 12)
    LSD(pounds, shillings, pence)
end

This definition teaches Julia how to handle the new LSD objects, and adds a new method to the + function, one that accepts two LSD objects, adds them together, and produces a new LSD object containing the sum.

Now you can add two prices:

julia> price1 + price2
LSD(6, 17, 2)

which is indeed the result of adding LSD(5,10,6) and LSD(1,6,8).

The next problem to address is the unattractive presentation of LSD objects. This is fixed in exactly the same way, by adding a new method, but this time to the show() function, which belongs to the Base module:

function Base.show(io::IO, money::LSD)
    print(io, "£$(money.pounds).$(money.shillings)s.$(money.pence)d")
end

Here, the io is the output channel currently used by all show() methods. We've added a simple expression that displays the field values with appropriate punctuation and separators.

julia> println(price1 + price2)
£6.17s.2d
julia> show(price1 + price2 + LSD(0,19,11) + LSD(19,19,6))
£27.16s.7d

You can add one or more aliases, which are alternative names for a particular type. Since Price is a better way of saying LSD, we'll create a valid alternative:

julia> const Price=LSD 
LSD

julia> show(Price(1, 19, 11))
£1.19s.11d

So far, so good, but these LSD objects are still not yet fully developed. If you want to do subtraction, multiplication, and division, you have to define additional methods for these functions for handling LSDs. Subtraction is easy enough, just requiring some fiddling with shillings and pence, so we'll leave that for now, but what about multiplication? Multiplying a price by a number involves two types of object, one a Price/LSD object, the other - well, any positive real number should be possible:

function Base.:*(a::LSD, b::Real)
    if b < 0
        error("Cannot multiply by a negative number")
    end

    totalpence = b * (a.pence + a.shillings * 12 + a.pounds * 240)
    (pounds, balance) = divrem(totalpence, 240)
    (shillings, pence) = divrem(balance, 12)
    LSD(pounds, shillings, pence)
end

Like the + method we added to Base's + function, this new * method for Base's * function is defined specifically to multiply a price by a number. It works surprisingly well for a first attempt:

julia> price1 * 2
£11.1s.0d
julia> price1 * 3
£16.11s.6d
julia> price1 * 10
£55.5s.0d
julia> price1 * 1.5
£8.5s.9d
julia> price3 = Price(0,6,5)
£0.6s.5d
julia> price3 * 1//7
£0.0s.11d

However, some failures are to be expected. We didn't allow for the really old-fashioned fractions of a penny: the halfpenny and the farthing:

julia> price1 * 0.25
ERROR: InexactError()
Stacktrace:
 [1] convert(::Type{Int64}, ::Float64) at ./float.jl:675
 [2] LSD(::Float64, ::Float64, ::Float64) at ./REPL[36]:40
 [3] *(::LSD, ::Float64) at ./REPL[55]:10

(The answer should be £1.7s.7½d. Unfortunately our LSD type doesn't allow fractions of a penny.)

But there's another, more pressing, problem. At the moment you have to give the price followed by the multiplier; the other way round fails:

julia> 2 * price1
ERROR: MethodError: no method matching *(::Int64, ::LSD)
Closest candidates are:
 *(::Any, ::Any, ::Any, ::Any...) at operators.jl:420
 *(::Number, ::Bool) at bool.jl:106
...

This is because, although Julia can find a method that matches (a::LSD, b::Real), it can't find one the other way round: (a::Number, b::LSD). But adding it is very easy:

function Base.:*(a::Number, b::LSD)
  b * a
end

which adds yet another method to Base's * function.

julia> price1 * 2
£11.1s.0d
julia> 2 * price1 
£11.1s.0d
julia> for i in 1:10
          println(price1 * i)
       end
£5.10s.6d
£11.1s.0d
£16.11s.6d
£22.2s.0d
£27.12s.6d
£33.3s.0d
£38.13s.6d
£44.4s.0d
£49.14s.6d
£55.5s.0d

The prices are now looking like an old British shop from the 19th century, forsooth!

If you want to see how many methods you've added for working with this old British pounds type so far, use the methodswith() function:

julia> methodswith(LSD)
4-element Array{Method,1}:
*(a::LSD, b::Real) at In[20]:4
*(a::Number, b::LSD) at In[34]:2
+(a::LSD, b::LSD) at In[13]:2
show(io::IO, money::LSD) at In[15]:2

Just four so far.... And you can continue to add methods to make the type more generally useful—it would depend on how you envisage yourself or others using it. For example, you probably want to add division and modulo methods, and to act intelligently about negative monetary values.
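
As one possible next step, here's a minimal sketch of the subtraction method that was skipped earlier. The type definition is repeated so that the example runs on its own. The helper inpence() is a hypothetical name, and refusing negative results is just one design choice among several.

```julia
# Same fields and checks as the LSD type defined earlier in this section
struct LSD
    pounds::Int
    shillings::Int
    pence::Int

    function LSD(a, b, c)
        if a < 0 || b < 0 || c < 0
            error("no negative numbers")
        end
        if c > 12 || b > 20
            error("too many pence or shillings")
        end
        new(a, b, c)
    end
end

# Hypothetical helper: the total value of a price in pence
inpence(p::LSD) = p.pence + p.shillings * 12 + p.pounds * 240

# Subtract in pence, then convert back to pounds, shillings, and pence
function Base.:-(a::LSD, b::LSD)
    difference = inpence(a) - inpence(b)
    if difference < 0
        error("result would be negative")
    end
    (pounds, balance) = divrem(difference, 240)
    (shillings, pence) = divrem(balance, 12)
    LSD(pounds, shillings, pence)
end

price1 = LSD(5, 10, 6)
price2 = LSD(1, 6, 8)
println(price1 - price2)    # LSD(4, 3, 10), or £4.3s.10d if the show() method above is defined
```

Working in pence and converting back at the end avoids any fiddling with borrowing between the shillings and pence columns.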

Mutable structs

This composite type for holding British prices was defined as an immutable type. You can't change the values of these price objects once you've created them:

julia> price1.pence
6

julia> price1.pence=10
ERROR: type LSD is immutable

To create a new price based on an existing one, you'd have to do this:

julia> price2 = Price(price1.pounds, price1.shillings, 10)
£5.10s.10d

For this particular example this isn't a big problem, but there are many applications where you might want to modify or update the value of a field in a type, rather than create a new one with the right values.

For these cases, you'd want to create a mutable struct. Choose between struct and mutable struct depending on the requirements made on the type.
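
The difference is easy to see side by side. In the following sketch, Till and FixedPrice are hypothetical types invented for this example: the first is mutable, so its field can be updated in place, whereas the second is not.

```julia
mutable struct Till          # hypothetical mutable type
    pence::Int
end

struct FixedPrice            # hypothetical immutable counterpart
    pence::Int
end

till = Till(100)
till.pence += 50             # update the field in place
println(till.pence)          # 150

fixed = FixedPrice(100)
try
    fixed.pence = 150        # not allowed for an immutable struct
catch err
    println("cannot modify a FixedPrice")
end

# The usual alternative is to build a new object instead
fixed = FixedPrice(fixed.pence + 50)
println(fixed.pence)         # 150
```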

For more about modules, and importing functions from other modules, see Modules and packages.