# F Sharp Programming/Caching


Caching is often useful to re-use data which has already been computed. F# provides a number of built-in techniques to cache data for future use.

## Partial Functions

F# automatically caches the value of any `let` binding which takes no parameters. When F# comes across a parameterless binding, it evaluates the body once and reuses the resulting value every time the binding is accessed. Compare the following:

```fsharp
let isNebraskaCity_bad city =
    let cities =
        printfn "Creating cities Set"
        ["Bellevue"; "Omaha"; "Lincoln"; "Papillion"]
        |> Set.ofList
    cities.Contains(city)

let isNebraskaCity_good =
    let cities =
        printfn "Creating cities Set"
        ["Bellevue"; "Omaha"; "Lincoln"; "Papillion"]
        |> Set.ofList
    fun city -> cities.Contains(city)
```

Both functions accept and return the same values, but they have very different behavior. Here's a comparison of the output in fsi:

**isNebraskaCity_bad:**

```fsharp
> let isNebraskaCity_bad city =
    let cities =
        printfn "Creating cities Set"
        ["Bellevue"; "Omaha"; "Lincoln"; "Papillion"]
        |> Set.ofList
    cities.Contains(city);;

val isNebraskaCity_bad : string -> bool

> isNebraskaCity_bad "Lincoln";;
Creating cities Set
val it : bool = true

> isNebraskaCity_bad "Washington";;
Creating cities Set
val it : bool = false

> isNebraskaCity_bad "Bellevue";;
Creating cities Set
val it : bool = true

> isNebraskaCity_bad "St. Paul";;
Creating cities Set
val it : bool = false
```

**isNebraskaCity_good:**

```fsharp
> let isNebraskaCity_good =
    let cities =
        printfn "Creating cities Set"
        ["Bellevue"; "Omaha"; "Lincoln"; "Papillion"]
        |> Set.ofList
    fun city -> cities.Contains(city);;
Creating cities Set

val isNebraskaCity_good : (string -> bool)

> isNebraskaCity_good "Lincoln";;
val it : bool = true

> isNebraskaCity_good "Washington";;
val it : bool = false

> isNebraskaCity_good "Bellevue";;
val it : bool = true

> isNebraskaCity_good "St. Paul";;
val it : bool = false
```

The implementation of `isNebraskaCity_bad` forces F# to re-create the internal set on each call. On the other hand, `isNebraskaCity_good` is a value initialized to the function `fun city -> cities.Contains(city)`, so it creates its internal set once and reuses it for all successive calls.

**Note:** Internally, `isNebraskaCity_bad` is compiled as a static function which constructs a set on every call. `isNebraskaCity_good` is compiled as a static readonly property, where the value is initialized in a static constructor.

This distinction is often subtle, but it can have a huge impact on an application's performance.
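The same idiom applies whenever a function closes over a resource that is expensive to build. As a sketch (the `isValidId` name and pattern are hypothetical, not part of the chapter), a validator can compile its regex once rather than on every call:

```fsharp
// The Regex is constructed exactly once, when isValidId is initialized;
// the returned closure reuses it on every call.
let isValidId =
    let pattern = System.Text.RegularExpressions.Regex(@"^\d{4}-\d{4}$")
    fun (s : string) -> pattern.IsMatch(s)
```

Had `isValidId` taken `s` directly as a parameter, the `Regex` would be rebuilt on every call, just like the set in `isNebraskaCity_bad`.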

## Memoization

"Memoization" is a fancy word meaning that computed values are stored in a lookup table rather than recomputed on every successive call. Long-running pure functions (i.e. functions which have no side-effects) are good candidates for memoization. Consider the recursive definition for computing Fibonacci numbers:

```fsharp
> #time;;

--> Timing now on

> let rec fib n =
    if n = 0I then 0I
    elif n = 1I then 1I
    else fib (n - 1I) + fib (n - 2I);;
Real: 00:00:00.000, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0

val fib : Math.bigint -> Math.bigint

> fib 35I;;
Real: 00:00:23.557, CPU: 00:00:23.515, GC gen0: 2877, gen1: 3, gen2: 0
val it : Math.bigint = 9227465I
```

Beyond `fib 35I`, the runtime of the function becomes unbearable. Each recursive call to the `fib` function throws away all of its intermediate calculations to `fib (n - 1I)` and `fib (n - 2I)`, giving it a runtime complexity of about O(2^n). What if we kept all of those intermediate calculations around in a lookup table? Here's the memoized version of the `fib` function:

```fsharp
> #time;;

--> Timing now on

> let rec fib =
    let dict = new System.Collections.Generic.Dictionary<_,_>()
    fun n ->
        match dict.TryGetValue(n) with
        | true, v -> v
        | false, _ ->
            let temp =
                if n = 0I then 0I
                elif n = 1I then 1I
                else fib (n - 1I) + fib (n - 2I)
            dict.Add(n, temp)
            temp;;

val fib : (Math.bigint -> Math.bigint)

> fib 35I;;
Real: 00:00:00.000, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0
val it : Math.bigint = 9227465I
```

Much better! This version of the `fib` function runs almost instantaneously. In fact, since we calculate the value of any `fib(n)` precisely once, and dictionary lookups are an O(1) operation, this `fib` function runs in O(n) time.

Notice that all of the memoization logic is contained in the `fib` function. We can write a more general function to memoize any function:

```fsharp
let memoize f =
    let dict = new System.Collections.Generic.Dictionary<_,_>()
    fun n ->
        match dict.TryGetValue(n) with
        | (true, v) -> v
        | _ ->
            let temp = f(n)
            dict.Add(n, temp)
            temp

let rec fib =
    memoize (fun n ->
        if n = 0I then 0I
        elif n = 1I then 1I
        else fib (n - 1I) + fib (n - 2I))
```

**Note:** It's very important to remember that the implementation above is not thread-safe: the dictionary should be locked before adding or retrieving items if it will be accessed by multiple threads.

Additionally, although dictionary lookups occur in constant time, the hash function used by the dictionary can take an arbitrarily long time to execute (this is especially true with strings, where the time it takes to hash a string is proportional to its length). For this reason, it is wholly possible for a memoized function to perform worse than an unmemoized one. Always profile code to determine whether optimization is necessary and whether memoization genuinely improves performance.
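One way to address the thread-safety caveat, as a sketch rather than the chapter's implementation, is to back the cache with a `ConcurrentDictionary` instead of locking by hand. Note that under contention `GetOrAdd` may invoke `f` more than once for the same key, though all callers observe a single cached result:

```fsharp
open System.Collections.Concurrent

// A thread-safe variant of memoize: GetOrAdd performs the lookup and the
// insertion as one atomic operation on a concurrent dictionary.
let memoizeConcurrent (f : 'a -> 'b) =
    let dict = ConcurrentDictionary<'a, 'b>()
    fun n -> dict.GetOrAdd(n, (fun key -> f key))
```

The tradeoff is the possible duplicate evaluation of `f` noted above; if that is unacceptable (say, `f` has side effects), wrapping the values in `Lazy<_>` or locking a plain `Dictionary` is the safer route.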

## Lazy Values

The F# `lazy` data type is an interesting primitive which delays evaluation of a value until the value is actually needed. Once computed, lazy values are cached for reuse later:

```fsharp
> let x = lazy (printfn "I'm lazy"; 5 + 5);;
val x : Lazy<int> = <unevaluated>

> x.Force();;  (* Should print "I'm lazy" *)
I'm lazy
val it : int = 10

> x.Force();;  (* Value already computed, should not print "I'm lazy" again *)
val it : int = 10
```

F# uses some compiler magic to avoid evaluating the expression `(printfn "I'm lazy"; 5 + 5)` on declaration. Lazy values are probably the simplest form of caching; however, they can be used to build some interesting and sophisticated data structures. For example, two F# constructs are implemented on top of lazy values, namely the F# lazy list and the `Seq.cache` function.

Lazy lists and cached sequences represent arbitrary sequences which may contain infinitely many elements. The elements are computed and cached the first time they are accessed, but will not be recomputed when the sequence is enumerated again. Here's a demonstration in fsi:

```fsharp
> let x = seq { for a in 1 .. 10 -> printfn "Got %i" a; a } |> Seq.cache;;
val x : seq<int>

> let y = Seq.take 5 x;;
val y : seq<int>

> Seq.reduce (+) y;;
Got 1
Got 2
Got 3
Got 4
Got 5
val it : int = 15

> Seq.reduce (+) y;;  (* Should not recompute values *)
val it : int = 15

> Seq.reduce (+) x;;  (* Values 1 through 5 already computed, should only compute 6 through 10 *)
Got 6
Got 7
Got 8
Got 9
Got 10
val it : int = 55
```