F Sharp Programming/Units of Measure

From Wikibooks, open books for an open world
F# : Units of Measure

Units of measure allow programmers to annotate floats and integers with statically checked unit metadata. This is handy when writing programs that manipulate numbers representing specific units of measure, such as kilograms, pounds, meters, newtons, or pascals. F# verifies that units are used where the programmer intended. For example, the F# compiler reports an error if a float<m/s> is used where it expects a float<kg>.

Use Cases

Statically Checked Type Conversions

Units of measure are invaluable to programmers who work in scientific research: they add an extra layer of protection against conversion-related errors. To cite a famous case study, NASA's $125 million Mars Climate Orbiter project ended in failure when the orbiter dipped 90 km closer to Mars than originally intended, causing it to tear apart and disintegrate in the Martian atmosphere. A post-mortem analysis traced the root cause to a unit mismatch in the data describing the orbiter's thruster firings: ground software supplied by the contractor reported impulse in Imperial units (pound-force seconds), but NASA's navigation software expected metric units (newton-seconds). Although many project-management errors contributed to the failed mission, this particular software bug could have been prevented if the software engineers had used a type system powerful enough to detect unit-related errors.

Decorating Data With Contextual Information

In his article Making Wrong Code Look Wrong, Joel Spolsky describes a scenario in which, during the design of Microsoft Word and Excel, programmers at Microsoft were required to track the position of objects on a page using two non-interchangeable coordinate systems:

In WYSIWYG word processing, you have scrollable windows, so every coordinate has to be interpreted as either relative to the window or relative to the page, and that makes a big difference, and keeping them straight is pretty important. [...]
The compiler won’t help you if you assign one to the other and Intellisense won’t tell you bupkis. But they are semantically different; they need to be interpreted differently and treated differently and some kind of conversion function will need to be called if you assign one to the other or you will have a runtime bug. If you’re lucky. [...]
In Excel’s source code you see a lot of rw and col and when you see those you know that they refer to rows and columns. Yep, they’re both integers, but it never makes sense to assign between them. In Word, I'm told, you see a lot of xl and xw, where xl means “horizontal coordinates relative to the layout” and xw means “horizontal coordinates relative to the window.” Both ints. Not interchangeable. In both apps you see a lot of cb meaning “count of bytes.” Yep, it’s an int again, but you know so much more about it just by looking at the variable name. It’s a count of bytes: a buffer size. And if you see xl = cb, well, blow the Bad Code Whistle, that is obviously wrong code, because even though xl and cb are both integers, it’s completely crazy to set a horizontal offset in pixels to a count of bytes.

In short, Microsoft depends on coding conventions to encode contextual data about a variable, and on code reviews to enforce correct usage of a variable from its context. This works in practice, but it's still possible for incorrect code to work its way into the product without the bug being detected for months.

If Microsoft were using a language with units of measure, they could have defined their own rw, col, xw, xl, and cb units of measure so that an assignment of the form int<xl> = int<cb> not only fails visual inspection, it doesn't even compile.
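The idea can be sketched in a few lines. This is only an illustration: the value names (layoutX, byteCount, shifted) are hypothetical and are not taken from the Word or Excel code bases.

```fsharp
[<Measure>] type xl   (* horizontal coordinate relative to the layout *)
[<Measure>] type xw   (* horizontal coordinate relative to the window *)
[<Measure>] type cb   (* count of bytes *)

(* Hypothetical values, chosen only for illustration. *)
let layoutX : int<xl> = 120<xl>
let byteCount : int<cb> = 512<cb>

(* Adding two layout coordinates is fine: both operands are int<xl>. *)
let shifted : int<xl> = layoutX + 8<xl>

(* Uncommenting the next line is expected to produce a unit-of-measure
   mismatch at compile time, because an int<cb> is not an int<xl>. *)
// let wrong : int<xl> = byteCount

printfn "%d" (int shifted)
```

The mistake that Spolsky's naming convention makes visible to a reviewer is here rejected by the type checker before the program ever runs.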

Defining Units

New units of measure are defined using the Measure attribute:

[<Measure>]
type m (* meter *)

[<Measure>]
type s (* second *)

Additionally, we can define measures which are derived from existing measures:

[<Measure>] type m                  (* meter *)
[<Measure>] type s                  (* second *)
[<Measure>] type kg                 (* kilogram *)
[<Measure>] type N = (kg * m)/(s^2) (* Newtons *)
[<Measure>] type Pa = N/(m^2)       (* Pascals *)
Important: Units of measure look like data types, but they aren't. .NET's type system does not support the behaviors that units of measure have, such as being squared, divided, or raised to powers. This functionality is provided by the F# static type checker at compile time, and units are erased from compiled code. Consequently, it is not possible to determine a value's unit at runtime.
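A small sketch of what erasure means in practice: boxing a measured value and asking for its runtime type shows that only the underlying .NET numeric type is left. The names distance and runtimeType are illustrative.

```fsharp
[<Measure>] type m

let distance = 100.0<m>

(* At runtime the unit is gone: the boxed value is a plain System.Double. *)
let runtimeType = (box distance).GetType()

printfn "%s" runtimeType.FullName   (* prints System.Double *)
```

Because float<m> and float<s> compile to the same runtime type, reflection and runtime type tests cannot tell them apart.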

We can create instances of float and integer data which represent these units using the same notation we use with generics:

> let distance = 100.0<m>
let time = 5.0<s>
let speed = distance / time;;

val distance : float<m> = 100.0
val time : float<s> = 5.0
val speed : float<m/s> = 20.0

Notice that F# automatically derives a new unit, m/s, for the value speed. Units of measure will multiply, divide, and cancel as needed depending on how they are used. Using these properties, it's very easy to convert between two units:

[<Measure>] type C
[<Measure>] type F

let to_fahrenheit (x : float<C>) = x * (9.0<F>/5.0<C>) + 32.0<F>
let to_celsius (x : float<F>) = (x - 32.0<F>) * (5.0<C>/9.0<F>)
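The unit arithmetic is what makes these conversions type-check: a float<C> multiplied by a float<F/C> has the C cancelled, leaving a float<F> that can be added to 32.0<F>. A short usage sketch follows, with the definitions repeated so that it runs on its own; the names boiling and roundTrip are illustrative.

```fsharp
[<Measure>] type C
[<Measure>] type F

let to_fahrenheit (x : float<C>) = x * (9.0<F>/5.0<C>) + 32.0<F>
let to_celsius (x : float<F>) = (x - 32.0<F>) * (5.0<C>/9.0<F>)

(* float<C> * float<F/C> = float<F>, so the result is in Fahrenheit. *)
let boiling : float<F> = to_fahrenheit 100.0<C>

(* Converting back should return approximately the original value. *)
let roundTrip : float<C> = to_celsius boiling

printfn "%f %f" (float boiling) (float roundTrip)
```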

Units of measure are statically checked at compile time for proper usage. For example, if we use a measure where it isn't expected, we get a compilation error:

> [<Measure>] type m
[<Measure>] type s

let speed (x : float<m>) (y : float<s>) = x / y;;

type m
type s
val speed : float<m> -> float<s> -> float<m/s>

> speed 20.0<m> 4.0<s>;; (* should get a speed *)
val it : float<m/s> = 5.0

> speed 20.0<m> 4.0<m>;; (* boom! *)

  speed 20.0<m> 4.0<m>;;

stdin(39,15): error FS0001: Type mismatch. Expecting a
    float<s>
but given a
    float<m>
The unit of measure 's' does not match the unit of measure 'm'

Units can be defined for integral types too:

> [<Measure>] type col
[<Measure>] type row
let colOffset (a : int<col>) (b : int<col>) = a - b
let rowOffset (a : int<row>) (b : int<row>) = a - b;;

type col
type row
val colOffset : int<col> -> int<col> -> int<col>
val rowOffset : int<row> -> int<row> -> int<row>
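A brief usage sketch, with the definitions repeated so that it runs on its own. The names dc and dr are illustrative, and the commented-out line shows the kind of mix-up the compiler is expected to reject.

```fsharp
[<Measure>] type col
[<Measure>] type row

let colOffset (a : int<col>) (b : int<col>) = a - b
let rowOffset (a : int<row>) (b : int<row>) = a - b

let dc = colOffset 10<col> 3<col>    (* int<col> *)
let dr = rowOffset 42<row> 40<row>   (* int<row> *)

(* Expected to fail at compile time: a row is passed where a column is required. *)
// let bad = colOffset 10<col> 3<row>

printfn "%d %d" (int dc) (int dr)
```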

Dimensionless Values

A value without a unit is dimensionless. Dimensionless values are represented implicitly by writing them out without units (e.g. 7.0, -14, 200.5), or they can be represented explicitly using the <1> type (e.g. 7.0<1>, -14<1>, 200.5<1>).

We can convert a dimensionless value to a specific measure by multiplying it by 1.0<targetMeasure> (or 1<targetMeasure> for integers). We can convert a measured value back to a dimensionless one by passing it to the built-in float or int functions:

[<Measure>] type m

(* val to_meters : (x : float<'u>) -> float<'u m> *)
let to_meters (x : float<'u>) = x * 1.0<m>

(* val of_meters : (x : float<m>) -> float *)
let of_meters (x : float<m>) = float x

Alternatively, it's often easier (and safer) to divide away unneeded units:

let of_meters (x : float<m>) = x / 1.0<m>
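Division is safer because the float function accepts a value with any unit and silently discards it, whereas dividing by 1.0<m> yields a dimensionless result only when the input really is in meters. A minimal sketch; the names length, raw, and careless are illustrative.

```fsharp
[<Measure>] type m
[<Measure>] type s

let to_meters (x : float<'u>) = x * 1.0<m>
let of_meters (x : float<m>) = x / 1.0<m>

let length : float<m> = to_meters 12.5   (* dimensionless -> meters *)
let raw : float = of_meters length       (* meters -> dimensionless *)

(* float strips whatever unit it is given, so this compiles even though
   the value is in seconds rather than meters. *)
let careless : float = float 3.0<s>

(* Expected to fail at compile time: seconds are not meters. *)
// let bad : float = of_meters 3.0<s>

printfn "%f %f %f" (float length) raw careless
```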

Generalizing Units of Measure

Since measures and dimensionless values are (or appear to be) generic types, we can write functions which operate on both transparently:

> [<Measure>] type m
[<Measure>] type kg

let vanillaFloats = [10.0; 15.5; 17.0]
let lengths = [ for a in [2.0; 7.0; 14.0; 5.0] -> a * 1.0<m> ]
let masses = [ for a in [155.54; 179.01; 135.90] -> a * 1.0<kg> ]
let densities = [ for a in [0.54; 1.0; 1.1; 0.25; 0.7] -> a * 1.0<kg/m^3> ]

let average (l : float<'u> list) =
    let sum, count = l |> List.fold (fun (sum, count) x -> sum + x, count + 1.0<_>) (0.0<_>, 0.0<_>)
    sum / count;;

type m
type kg
val vanillaFloats : float list = [10.0; 15.5; 17.0]
val lengths : float<m> list = [2.0; 7.0; 14.0; 5.0]
val masses : float<kg> list = [155.54; 179.01; 135.9]
val densities : float<kg/m ^ 3> list = [0.54; 1.0; 1.1; 0.25; 0.7]
val average : float<'u> list -> float<'u>

> average vanillaFloats, average lengths, average masses, average densities;;
val it : float * float<m> * float<kg> * float<kg/m ^ 3> =
  (14.16666667, 7.0, 156.8166667, 0.718)
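Generic functions can also change the unit of their result. As a further sketch (the function name square is an assumption, not part of the example above), annotating the argument as float<'u> lets the compiler work out the unit of the result for each call:

```fsharp
[<Measure>] type m

(* The explicit float<'u> annotation keeps the function generic over units;
   the result carries the squared unit. *)
let square (x : float<'u>) : float<'u ^ 2> = x * x

let area : float<m^2> = square 3.0<m>   (* meters become square meters *)
let plain : float = square 3.0          (* dimensionless stays dimensionless *)

printfn "%f %f" (float area) plain
```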

Since units are erased from compiled code, they are not considered a real data type, so they can't be used directly as a type parameter in generic functions and classes. For example, the following code will not compile:

> type triple<'a> = { a : float<'a>; b : float<'a>; c : float<'a>};;

  type triple<'a> = { a : float<'a>; b : float<'a>; c : float<'a>};;

stdin(40,31): error FS0191: Expected unit-of-measure parameter, not type parameter.
Explicit unit-of-measure parameters must be marked with the [<Measure>] attribute

F# does not infer that 'a is a unit of measure above, possibly because the following code appears correct but can be used in nonsensical ways:

type quad<'a> = { a : float<'a>; b : float<'a>; c : float<'a>; d : 'a}

The type 'a can be a unit of measure or a data type, but not both at the same time. F#'s type checker assumes 'a is a type parameter unless otherwise specified. We can use the [<Measure>] attribute to change the 'a to a unit of measure:

> type triple<[<Measure>] 'a> = { a : float<'a>; b : float<'a>; c : float<'a>};;

type triple<[<Measure>] 'a> =
  {a: float<'a>;
   b: float<'a>;
   c: float<'a>;}

> { a = 7.0<kg>; b = -10.5<_>; c = 0.5<_> };;
val it : triple<kg> = {a = 7.0;
                       b = -10.5;
                       c = 0.5;}
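Functions over such a type can stay generic in the unit as well. A minimal sketch, assuming a hypothetical helper named total and repeating the type definition so that it runs on its own:

```fsharp
[<Measure>] type kg

type triple<[<Measure>] 'a> = { a : float<'a>; b : float<'a>; c : float<'a> }

(* Works for a triple of any unit and returns a value in that same unit. *)
let total (t : triple<'u>) : float<'u> = t.a + t.b + t.c

let masses = { a = 7.0<kg>; b = -10.5<kg>; c = 0.5<kg> }
let totalMass : float<kg> = total masses

printfn "%f" (float totalMass)
```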

F# PowerPack

The F# PowerPack (FSharp.PowerPack.dll) includes a number of predefined units of measure for scientific applications. These are available in the following modules:

External Resources
