F Sharp Programming/Arrays

From Wikibooks, open books for an open world
Jump to: navigation, search
Previous: Control Flow Index Next: Mutable Collections
F# : Arrays

Arrays are a ubiquitous, familiar data structure used to represent a group of related, ordered values. Unlike F# data structures, arrays are mutable, meaning the values in an array can be changed after the array has been created.

Creating Arrays[edit]

Arrays are conceptually similar to lists. Naturally, arrays can be created using many of the same techniques as lists:

Array literals[edit]

> [| 1; 2; 3; 4; 5 |];;
val it : int array = [|1; 2; 3; 4; 5|]

Array comprehensions[edit]

F# supports array comprehensions using ranges and generators in the same style and format as list comprehensions:

> [| 1 .. 10 |];;
val it : int array = [|1; 2; 3; 4; 5; 6; 7; 8; 9; 10|]
 
> [| 1 .. 3 .. 10 |];;
val it : int array = [|1; 4; 7; 10|]
 
> [| for a in 1 .. 5 do
    yield (a, a*a, a*a*a) |];;
val it : (int * int * int) array
= [|(1, 1, 1); (2, 4, 8); (3, 9, 27); (4, 16, 64); (5, 25, 125)|]

System.Array Methods[edit]

There are several methods in the System.Array module for creating arrays:


val zeroCreate : int arraySize -> 'T []

Creates an array with arraySize elements. Each element in the array holds the default value for the particular data type (0 for numbers, false for bools, null for reference types).
> let (x : int array) = Array.zeroCreate 5;;
val x : int array
 
> x;;
val it : int array = [|0; 0; 0; 0; 0|]

val create : int -> 'T value -> 'T []

Creates an array with arraySize elements. Initializes each element in the array with value.
> Array.create 5 "Juliet";;
val it : string [] = [|"Juliet"; "Juliet"; "Juliet"; "Juliet"; "Juliet"|]

val init : int arraySize -> (int index -> 'T) initializer -> 'T []

Creates an array with arraySize elements. Initializes each element in the array with the initializer function.
> Array.init 5 (fun index -> sprintf "index: %i" index);;
val it : string []
= [|"index: 0"; "index: 1"; "index: 2"; "index: 3"; "index: 4"|]

Working With Arrays[edit]

Elements in an array are accessed by their index, or position in an array. Array indexes always start at 0 and end at array.length - 1. For example, lets take the following array:

let names = [| "Juliet"; "Monique"; "Rachelle"; "Tara"; "Sophia" |]
(* Indexes:        0         1           2         3        4 *)

This list contains 5 items. The first index is 0, and the last index is 4.

We can access items in the list using the .[index] operator, also called indexer notation. Here is the same list in fsi:

> let names = [| "Juliet"; "Monique"; "Rachelle"; "Tara"; "Sophia" |];;
 
val names : string array
 
> names.[2];;
val it : string = "Rachelle"
 
> names.[0];;
val it : string = "Juliet"

Instances of arrays have a Length property which returns the number of elements in the array:

> names.Length;;
val it : int = 5
 
> for i = 0 to names.Length - 1 do
    printfn "%s" (names.[i]);;
Juliet
Monique
Rachelle
Tara
Sophia

Arrays are mutable data structures, meaning we can assign elements in an array new values at any point in our program:

> names;;
val it : string array = [|"Juliet"; "Monique"; "Rachelle"; "Tara"; "Sophia"|]
 
> names.[4] <- "Kristen";;
val it : unit = ()
 
> names;;
val it : string array = [|"Juliet"; "Monique"; "Rachelle"; "Tara"; "Kristen"|]

If you try to access an element outside the range of an array, you'll get an exception:

> names.[-1];;
System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at <StartupCode$FSI_0029>.$FSI_0029._main()
stopped due to error
 
> names.[5];;
System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at <StartupCode$FSI_0030>.$FSI_0030._main()
stopped due to error

Array Slicing[edit]

F# supports a few useful operators which allow programmers to return "slices" or subarrays of an array using the .[start..finish] operator, where one of the start and finish arguments may be omitted.

> let names = [|"0: Juliet"; "1: Monique"; "2: Rachelle"; "3: Tara"; "4: Sophia"|];;
 
val names : string array
 
> names.[1..3];; (* Grabs items between index 1 and 3 *)
val it : string [] = [|"1: Monique"; "2: Rachelle"; "3: Tara"|]
 
> names.[2..];; (* Grabs items between index 2 and last element *)
val it : string [] = [|"2: Rachelle"; "3: Tara"; "4: Sophia"|]
 
> names.[..3];; (* Grabs items between first element and index 3 *)
val it : string [] = [|"0: Juliet"; "1: Monique"; "2: Rachelle"; "3: Tara"|]

Note that array slices generate a new array, rather than altering the existing array. This requires allocating new memory and copying elements from our source array into our target array. If performance is a high priority, it is generally more efficient to look at parts of an array using a few index adjustments.

Multi-dimensional Arrays[edit]

A multi-dimensional array is literally an array of arrays. Conceptually, it's not any harder to work with these types of arrays than single-dimensional arrays (as shown above). Multi-dimensional arrays come in two forms: rectangular and jagged arrays.

Rectangular Arrays[edit]

A rectangular array, which may be called a grid or a matrix, is an array of arrays, where all of the inner arrays have the same length. Here is a simple 2x3 rectangular array in fsi:

> Array2D.zeroCreate<int> 2 3;;
val it : int [,] = [|[|0; 0; 0|]; [|0; 0; 0|]|]

This array has 2 rows, and each row has 3 columns. To access elements in this array, you use the .[row,col] operator:

> let grid = Array2D.init<string> 3 3 (fun row col -> sprintf "row: %i, col: %i" row col);;
 
val grid : string [,]
 
> grid;;
val it : string [,]
= [|[|"row: 0, col: 0"; "row: 0, col: 1"; "row: 0, col: 2"|];
    [|"row: 1, col: 0"; "row: 1, col: 1"; "row: 1, col: 2"|];
    [|"row: 2, col: 0"; "row: 2, col: 1"; "row: 2, col: 2"|]|]
 
> grid.[0, 1];;
val it : string = "row: 0, col: 1"
 
> grid.[1, 2];;
val it : string = "row: 1, col: 2"

Here's a simple program to demonstrate how to use and iterate through multidimensional arrays:

open System
 
let printGrid grid =
    let maxY = (Array2D.length1 grid) - 1
    let maxX = (Array2D.length2 grid) - 1
 
    for row in 0 .. maxY do
        for col in 0 .. maxX do
            if grid.[row, col] = true then Console.Write("* ")
            else Console.Write("_ ")
        Console.WriteLine()
 
let toggleGrid (grid : bool[,]) =
    Console.WriteLine()
    Console.WriteLine("Toggle grid:")
 
    let row =
        Console.Write("Row: ")
        Console.ReadLine() |> int
 
    let col =
        Console.Write("Col: ")
        Console.ReadLine() |> int
 
    grid.[row, col] <- (not grid.[row, col])
 
let main() =
    Console.WriteLine("Create a grid:")
    let rows =
        Console.Write("Rows: ")
        Console.ReadLine() |> int
 
    let cols =
        Console.Write("Cols: ")
        Console.ReadLine() |> int
 
    let grid = Array2D.zeroCreate<bool> rows cols
    printGrid grid
 
    let mutable go = true
    while go do
        toggleGrid grid
        printGrid grid
        Console.Write("Keep playing (y/n)? ")
        go <- Console.ReadLine() = "y"
 
    Console.WriteLine("Thanks for playing")
 
main()

This program outputs the following:

Create a grid:
Rows: 2
Cols: 3
_ _ _
_ _ _

Toggle grid:
Row: 0
Col: 1
_ * _
_ _ _
Keep playing (y/n)? y

Toggle grid:
Row: 1
Col: 1
_ * _
_ * _
Keep playing (y/n)? y

Toggle grid:
Row: 1
Col: 2
_ * _
_ * *
Keep playing (y/n)? n

Thanks for playing

In additional to the Array2D module, F# has an Array3D module to support 3-dimensional arrays as well.

Note It's possible to create arrays with more than 3 dimensions using the System.Array.CreateInstance method, but it's generally recommended to avoid creating arrays with huge numbers of elements or dimensions, since it can quickly consume all of the available memory on a machine. For comparison, an int is 4 bytes, and a 1000x1000x1000 int array would consume about 3.7 GB of memory, more than the memory available on 99% of desktop PCs.

Jagged arrays[edit]

A jagged array is an array of arrays, except each row in the array does not necessary need to have the same number of elements:

> [| for a in 1 .. 5 do yield [| 1 .. a |] |];;
val it : int array array
= [|[|1|];
    [|1; 2|];
    [|1; 2; 3|];
    [|1; 2; 3; 4|];
    [|1; 2; 3; 4; 5|]|]

You use the .[index] operator to access items in the array. Since each element contains another array, it's common to see code that resembles .[row].[col]:

> let jagged = [| for a in 1 .. 5 do yield [| 1 .. a |] |]
for arr in jagged do
    for col in arr do
        printf "%i " col
    printfn "";;
 
val jagged : int array array
 
1 
1 2 
1 2 3 
1 2 3 4 
1 2 3 4 5
 
> jagged.[2].[2];;
val it : int = 3
 
> jagged.[4].[0];;
val it : int = 1
Note: Notice that the data type of a rectangular array is 'a[,], but the data type of a jagged array is 'a array array. This results because a rectangular array stores data "flat", whereas a jagged array is literally an array of pointers to arrays. Since these two types of arrays are stored differently in memory, F# treats 'a[,] and 'a array array as two different, non-interchangeable data types. As a result, rectangular and jagged arrays have slightly different syntax for element access.
Rectangular arrays are stored in a slightly more efficient manner and generally perform better than jagged arrays, although there may not be a perceptible difference in most applications. However, it's worth noting performance differences between the two in benchmark tests.

Using the Array Module[edit]

There are two array modules, System.Array and Microsoft.FSharp.Collections.Array, developed by the .NET BCL designers and the creators of F# respectively. Many of the methods and functions in the F# Array module are similar to those in the List module.

val append : 'T[] first -> 'T[] second -> 'T[]

Returns a new array with elements consisting of the first array followed by the second array.

val choose : ('T item -> 'U option) -> 'T[] input -> 'U[]

Filters and maps an array, returning a new array consisting of all elements which returned Some.

val copy : 'T[] input -> 'T[]

Returns a copy of the input array.

val fill : 'T[] input -> int start -> int end -> 'T value -> unit

Assigns value to all the elements between the start and end indexes.

val filter : ('T -> bool) -> 'T[] -> 'T[]

Returns a new array consisting of items filtered out of the input array.

val fold : ('State -> 'T -> 'State) -> 'State -> 'T[] input -> 'State

Accumulates left to right over an array.

val foldBack : ('T -> 'State -> 'State) -> 'T[] input -> 'State -> 'State

Accumulates right to left over an array.

val iter : ('T -> unit) -> 'T[] input -> unit

Applies a function to all of the elements of the input array.

val length : 'T[] -> int

Returns the number of items in an array.

val map : ('T -> 'U) -> 'T[] -> 'U[]

Applies a mapping function to each element in the input array to return a new array.

val rev : 'T[] input -> 'T[]

Returns a new array with the items in reverse order.

val sort : 'T[] -> 'T[]

Sorts a copy of an array.

val sortInPlace : 'T[] -> unit

Sorts an array in place. Note that the sortInPlace method returns unit, indicating the sortInPlace mutates the original array.

val sortBy : ('T -> 'T -> int) -> 'T[] -> 'T[]

Sorts a copy of an array based on the sorting function.

val sub : 'T[] -> int start -> int end -> 'T[]

Returns a sub array based on the given start and end indexes.

Differences Between Arrays and Lists[edit]

  • Lists
    • Symbol thumbs up.svg Immutable, allows new lists to share nodes with other lists.
    • Symbol thumbs up.svg List literals.
    • Symbol thumbs up.svg Pattern matching.
    • Symbol thumbs up.svg Supports mapping and folding.
    • Symbol thumbs down.svg Linear lookup time.
    • Symbol thumbs down.svg No random access to elements, just "forward-only" traversal.
  • Arrays
    • Symbol thumbs up.svg Array literals.
    • Symbol thumbs up.svg Constant lookup time.
    • Symbol thumbs up.svg Good spacial locality of reference ensures efficient lookup time.
    • Symbol thumbs up.svg Indexes indicate the position of each element relative to others, making arrays ideal for random access.
    • Symbol thumbs up.svg Supports mapping and folding.
    • Symbol thumbs down.svg Mutability prevents arrays from sharing nodes with other elements in an array.
    • Symbol thumbs down.svg Not resizable.

Representation in Memory[edit]

Items in an array are represented in memory as adjacent values in memory. For example, lets say we create the following int array:

[| 15; 5; 21; 0; 9 |]

Represented in memory, our array resembles something like this:

Memory Location: |  100 |  104 |  108 |  112 |  116
          Value: |   15 |    5 |   21 |    0 |    9
          Index: |    0 |    1 |    2 |    3 |    4

Each int occupies 4 bytes of memory. Since our array contains 5 elements, the operating systems allocates 20 bytes of memory to hold this array (4 bytes * 5 elements = 20 bytes). The first first element in the array occupies memory 100-103, the second element occupies 104-107, and so on.

We know that each element in the array is identified by it's index or position in the array. Literally, the index is an offset: since the array starts at memory location 100, and each element in the array occupies a fixed amount of memory, the operating system can know the exact address of each element in memory using the formula:

Start memory address of element at index n = StartPosition of array + (n * length of data type)
End memory address of element at index n = Start memory address + length of data type

In laymens terms, this means we can access the nth element of any array in constant time, or in O(1) operations. This is in contrast to lists, where accessing the nth element requires O(n) operations.

With the understanding that elements in an array occupy adjacent memory locations, we can deduce two properties of arrays:

  1. Creating arrays requires programmers to specify the size of the array upfront, otherwise the operating system won't know how many adjacent memory locations to allocate.
  2. Arrays are not resizable, because memory locations before the first element or beyond the last element may hold data used by other applications. An array is only "resized" by allocating a new block of memory and copying all of the elements from the old array into the new array.


Previous: Control Flow Index Next: Mutable Collections