F Sharp Programming/Sequences
From Wikibooks, the open-content textbooks collection
| F# : Sequences |
Sequences, commonly called sequence expressions, are similar to lists: both data structures are used to represent an ordered collection of values. However, unlike lists, elements in a sequence are computed as they are needed (or "lazily"), rather than computed all at once. This gives sequences some interesting properties, such as the capacity to represent infinite data structures.
[edit] Defining Sequences
Sequences are defined using the syntax:
seq { expr }
Similar to lists, sequences can be constructed using ranges and comprehensions:
> seq { 1 .. 10 };; (* seq ranges *) val it : seq<int> = seq [1; 2; 3; 4; ...] > seq { 1 .. 2 .. 10 };; (* seq ranges *) val it : seq<int> = seq [1; 3; 5; 7; ...] > seq { for a in 1 .. 10 do yield a, a*a, a*a*a };; (* seq comprehensions *) val it : seq<int * int * int> = seq [(1, 1, 1); (2, 4, 8); (3, 9, 27); (4, 16, 64); ...]
Sequences have an interesting property which sets them apart from lists: elements in the sequence are lazily evaluated, meaning that F# does not compute values in a sequence until the values are actually needed. This is in contrast to lists, where F# computes the value of all elements in a list on declaration. As a demonstration, compare the following:
> let intList = [ for a in 1 .. 10 do printfn "intList: %i" a yield a ] let intSeq = seq { for a in 1 .. 10 do printfn "intSeq: %i" a yield a };; val intList : int list val intSeq : seq<int> intList: 1 intList: 2 intList: 3 intList: 4 intList: 5 intList: 6 intList: 7 intList: 8 intList: 9 intList: 10 > Seq.nth 3 intSeq;; intSeq: 1 intSeq: 2 intSeq: 3 intSeq: 4 val it : int = 4 > Seq.nth 7 intSeq;; intSeq: 1 intSeq: 2 intSeq: 3 intSeq: 4 intSeq: 5 intSeq: 6 intSeq: 7 intSeq: 8 val it : int = 8
The list is created on declaration, but elements in the sequence are created as they are needed.
As a result, sequences are able to represent a data structure with an arbitrary number of elements:
> seq { 1I .. 1000000000000I };; val it : seq<bigint> = seq [1I; 2I; 3I; 4I; ...]
The sequence above represents a list with one trillion elements in it. That does not mean the sequence actually contains one trillion elements, but it can potentially hold one trillion elements. By comparison, it would not be possible to create a list [ 1I .. 1000000000000I ] since the .NET runtime would attempt to create all one trillion elements up front, which would certainly consume all of the available memory on a system before the operation completed.
Additionally, sequences can represent an infinite number of elements:
> let allEvens = let rec loop x = seq { yield x; yield! loop (x + 2) } loop 0;; > for a in (Seq.take 5 allEvens) do printfn "%i" a;; 0 2 4 6 8
Notice the definition of allEvens does not terminate. The function Seq.take returns the first n elements of elements of the sequence. If we attempted to loop through all of the elements, fsi would print indefinitely.
- Note: sequences are implemented as state machines by the F# compiler. In reality, they manage state interally and hold only the last generated item in memory at a time. Memory usage is constant for creating and traversing sequences of any length.
[edit] Iterating Through Sequences Manually
The .NET BCL contains two interfaces in the System.Collections.Generic namespace:
type IEnumerable<'a> = interface (* Returns an enumerator that iterates through a collection *) member GetEnumerator<'a> : unit -> IEnumerator<'a> end type IEnumerator<'a> = interface (* Advances to the next element in the sequences. Returns true if the enumerator was successfully advanced to the next element; false if the enumerator has passed the end of the collection. *) member MoveNext : unit -> bool (* Gets the current element in the collection. *) member Current : 'a (* Sets the enumerator to its initial position, which is before the first element in the collection. *) member Reset : unit -> unit end
The seq type is defined as follows:
type seq<'a> = System.Collections.Generic.IEnumerable<'a>
As you can see, seq is not a unique F# type, but rather another name for the built-in System.Collections.Generic.IEnumerable interface. Since seq/IEnumerable is a native .NET type, it was designed to be used in a more imperative style, which can be demonstrated as follows:
open System open System.Collections let evens = seq { 0 .. 2 .. 10 } (* returns IEnumerable<int> *) let main() = let evensEnumerator = evens.GetEnumerator() (* returns IEnumerator<int> *) while evensEnumerator.MoveNext() do printfn "evensEnumerator.Current: %i" evensEnumerator.Current Console.ReadKey(true) |> ignore main()
This program outputs:
evensEnumerator.Current: 0 evensEnumerator.Current: 2 evensEnumerator.Current: 4 evensEnumerator.Current: 6 evensEnumerator.Current: 8 evensEnumerator.Current: 10
Behind the scenes, .NET converts every for loop over a collection into an explicit while loop. In other words, the following two pieces of code compile down to the same bytecode:
let x = [1 .. 10] for num in x do printfn "%i" num |
let x = [1 .. 10] let enumerator = x.GetEnumerator() while enumerator.MoveNext() do let num = enumerator.Current printfn "%i" num |
All collections which can be used with the for keyword implement the IEnumerable<'a> interface, a concept which will be discussed later in this book.
[edit] The Seq Module
Similar to the List modules, the Seq module contains a number of useful functions for operating on sequences:
val append : seq<'T> -> seq<'T> -> seq<'T>
- Appends one sequence onto another sequence.
val choose : ('T -> 'U option) -> seq<'T> -> seq<'U>
- Filters and maps a sequence to another sequence.
val distinct : seq<'T> -> seq<'T>
- Returns a sequence that filters out duplicate entries.
val exists : ('T -> bool) -> seq<'T> -> bool
- Determines if an element exist in a sequence.
val filter : ('T -> bool) -> seq<'T> -> seq<'T>
- Builds a new sequence consisting of elements filtered from the input sequence.
val fold : ('State -> 'T -> 'State) -> 'State -> seq<'T> -> 'State
- Repeatedly applies a function to each element in the sequence from left to right.
- Note: sequences can only be read forward-only manner, so there is no corresponding
foldBackfunction as found in the List and Array modules.
val initFinite : int elements -> (int index -> 'T) -> seq<'T>
- Generates a sequence consisting of a finite number of elements.
val initInfinite : (int -> 'T) -> seq<'T>
- Generates a sequence consisting of an infinte number of elements.
val map : ('T -> 'U) -> seq<'T> -> seq<'U>
- Maps a sequence of type
'ato type'b.
val nth : int -> seq<'T> -> 'T
- Returns the nth value of a sequence.
val take : int -> seq<'T> -> seq<'T>
- Returns a new sequence consisting of the first n elements of the input sequence.
val takeWhile : ('T -> bool) -> seq<'T> -> seq<'T>
- Return a sequence that, when iterated, yields elements of the underlying sequence while the given predicate returns
true, and returns no further elements.
val unfold : ('State -> ('T * 'State) option) -> 'State seed -> seq<'T>
- The opposite of
fold: this function generates a sequence as long as the generator function returnsSome.
> let fibs = (0I, 1I) |> Seq.unfold (fun (a, b) -> Some(a, (b, a + b) ) );; val fibs : seq<bigint> > Seq.iter (fun x -> printf "%O " x) (Seq.take 20 fibs);; 0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181
The generator function in unfold expects a return type of ('T * 'State) option. The first value of the tuple is inserted as an element into the sequence, the second value of the tuple is passed as the accumulator. The fibs function is clever for its brevity, but its hard to understand if you've never seen an unfold function. The following demonstrates unfold in a more straightforward way:
> let test = 1 |> Seq.unfold (fun x -> if x <= 5 then Some(sprintf "x: %i" x, x + 1) else None );; val test : seq<string> > Seq.iter (fun x -> printfn "%s" x) test;; x: 1 x: 2 x: 3 x: 4 x: 5
Often, its preferable to generate sequences using seq comprehensions rather than the unfold.