F Sharp Programming/Basic Concepts

From Wikibooks, open books for an open world
Jump to: navigation, search
Previous: Getting Set Up Index Next: Values and Functions


F# : Basic Concepts


Now that we have a working installation of F# we can explore the syntax of F# and the basics of functional programming. We'll start off in the Interactive F Sharp environment as this gives us some very valuable type information, which helps get to grips with what is actually going on in F#. Open F# interactive from the start menu, or open a command-line prompt and type fsi.

Major Features[edit]

Fundamental Data Types and Type System[edit]

In computer programming, every piece of data has a type, which, predictably, describes the type of data a programmer is working with. In F#, the fundamental data types are:

F# Type .NET Type Size Range Example Represents
Integral Types
sbyte System.SByte 1 byte -128 to 127 42y
-11y
8-bit signed integer
byte System.Byte 1 byte 0 to 255 42uy
200uy
8-bit unsigned integer
int16 System.Int16 2 bytes -32768 to 32767 42s
-11s
16-bit signed integer
uint16 System.UInt16 2 bytes 0 to 65,535 42us
200us
16-bit unsigned integer
int/int32 System.Int32 4 bytes -2,147,483,648 to 2,147,483,647 42
-11
32-bit signed integer
uint32 System.UInt32 4 bytes 0 to 4,294,967,295 42u
200u
32-bit unsigned integer
int64 System.Int64 8 bytes -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 42L
-11L
64-bit signed integer
uint64 System.UInt64 8 bytes 0 to 18,446,744,073,709,551,615 42UL
200UL
64-bit unsigned integer
bigint System.Numerics.BigInteger At least 4 bytes any integer 42I
14999999999999999999999999999999I
arbitrary precision integer
Floating Point Types
float32 System.Single 4 bytes ±1.5e-45 to ±3.4e38 42.0F
-11.0F
32-bit signed floating point number (7 significant digits)
float System.Double 8 bytes ±5.0e-324 to ±1.7e308 42.0
-11.0
64-bit signed floating point number (15-16 significant digits)
decimal System.Decimal 16 bytes ±1.0e-28 to ±7.9e28 42.0M
-11.0M
128-bit signed floating point number (28-29 significant digits)
BigRational Microsoft.FSharp.Math.BigRational At least 4 bytes Any rational number. 42N
-11N
Arbitrary precision rational number. Using this type requires a reference to FSharp.PowerPack.dll.
Text Types
char System.Char 2 bytes U+0000 to U+ffff 'x'
'\t'
Single unicode characters
string System.String 20 + (2 * string's length) bytes 0 to about 2 billion characters "Hello"
"World"
Unicode text
Other Types
bool System.Boolean 1 byte Only two possible values, true or false true
false
Stores boolean values

F# is a fully object-oriented language, using an object model based on the .NET Common Language Infrastructure (CLI). As such, it has a single-inheritance, multiple interface object model, and allows programmers to declare classes, interfaces, and abstract classes. Notably, it has full support for generic class, interface, and function definitions; however, it lacks some OO features found in other languages, such as mixins and multiple inheritance.

F# also provides a unique array of data structures built directly into the syntax of the language, which include:

  • Unit, the datatype with only one value, equivalent to void in C-style languages.
  • Tuple types, which are ad hoc data structures that programmers can use to group related values into a single object.
  • Record types, which are similar to tuples, but provide named fields to access data held by the record object.
  • Discriminated unions, which are used to create very well-defined type hierarchies and hierarchical data structures.
  • Lists, Maps, and Sets, which represent immutable versions of a stack, hashtable, and set data structures, respectively.
  • Sequences, which represent a lazy list of items computed on-demand.
  • Computation expressions, which serve the same purpose as monads in Haskell, allowing programmers to write continuation-style code in an imperative style.

All of these features will be further enumerated and explained in later chapters of this book.

F# is a statically typed language, meaning that the compiler knows the datatype of variables and functions at compile time. F# is also strongly typed, meaning that a variable bound to ints cannot be rebound to strings at some later point; an int variable is forever tied to int data.

Unlike C# and VB.Net, F# does not perform implicit casts, not even safe conversions (such as converting an int to a int64). F# requires explicit casts to convert between datatypes, for example:

> let x = 5;;
 
val x : int = 5
 
> let y = 6L;;
 
val y : int64 = 6L
 
> let z = x + y;;
 
  let z = x + y;;
  ------------^
 
stdin(5,13): error FS0001: The type 'int64' does not match the type 'int'
> let z = (int64 x) + y;;
 
val z : int64 = 11L

The mathematical operators +, -, /, *, and % are overloaded to work with different datatypes, but they require arguments on each side of the operator to have the same datatype. We get an error trying to add an int to an int64, so we have to cast one of our variables above to the other's datatype before the program will successfully compile.

Type Inference[edit]

Unlike many other strongly typed languages, F# often does not require programmers to use type annotations when declaring functions and variables. Instead, F# attempts to work out the types for you, based on the way that variables are used in code.

For example, let's take this function:

let average a b = (a + b) / 2.0

We have not used any type annotations: that is, we have not explicitly told the compiler the data type of a and b, nor have we indicated the type of the function's return value. If F# is a strongly, statically typed language, how does the compiler know the datatype of anything beforehand? That's easy, it uses simple deduction:

  • The + and / operators are overloaded to work on different datatypes, but it defaults to integer addition and integer division without any extra information.
  • (a + b) / 2.0, the value in bold has the type float. Since F# doesn't perform implicit casts, and it requires arguments on both sides of a math operator to have the same datatype, the value (a + b) must return a float as well.
  • The + operator only returns float when both arguments on each side of the operator are floats, so a and b must be floats as well.
  • Finally, since the return value of float / float is float, the average function must return a float.

This process is called type-inference. On most occasions, F# will be able to work out the types of data on its own without requiring the programmer to explicitly write out type annotations. This works just as well for small programs as large programs, and it can be a tremendous time-saver.

On those occasions where F# cannot work out the types correctly, the programmer can provide explicit annotations to guide F# in the right direction. For example, as mentioned above, math operators default to operations on integers:

> let add x y = x + y;;
val add : int -> int -> int

In absence of other information, F# determines that add takes two integer and returns another integer. If we wanted to use floats instead, we'd write:

> let add (x : float) (y : float) = x + y;;
val add : float -> float -> float

Pattern Matching[edit]

F#'s pattern matching is similar to an if... then or switch construct in other languages, but is much more powerful. Pattern matching allows a programmer to decompose data structures into their component parts. It matches values based on the shape of the data structure, for example:

type Proposition =  // type with possible expressions ... note recursion for all expressions except True
    | True          // essentially this is defining boolean logic
    | Not of Proposition
    | And of Proposition * Proposition
    | Or of Proposition * Proposition
 
let rec eval x =
    match x with
    | True -> true // syntax: Pattern-to-match -> Result
    | Not(prop) -> not (eval prop)
    | And(prop1, prop2) -> (eval prop1) && (eval prop2)
    | Or(prop1, prop2) -> (eval prop1) || (eval prop2)
 
let shouldBeFalse = And(Not True, Not True)
 
let shouldBeTrue = Or(True, Not True)
 
let complexLogic = 
    And(And(True,Or(Not(True),True)),
        Or(And(True, Not(True)), Not(True)) )
 
printfn "shouldBeFalse: %b" (eval shouldBeFalse)    // prints False
printfn "shouldBeTrue: %b" (eval shouldBeTrue)      // prints True
printfn "complexLogic: %b" (eval complexLogic)      // prints False

The eval method uses pattern matching to recursively traverse and evaluate the abstract syntax tree. The rec keyword marks the function as recursive. Pattern matching will be explained in more detail in later chapters of this book.

Functional Programming Contrasted with Imperative Programming[edit]

F# is a mixed-paradigm language: it supports imperative, object-oriented, and functional styles of writing code, with heaviest emphasis on the latter.

Immutable Values vs Variables[edit]

The first mistake a newcomer to functional programming makes is thinking that the let construct is equivalent to assignment. Consider the following code:

let a = 1
(* a is now 1 *)
let a = a + 1
(* in F# this throws an error:  Duplicate definition of value 'a' *)

On the surface, this looks exactly like the familiar imperative pseudocode:

 a = 1
 // a is 1
 a = a + 1
 // a is 2

However, the nature of the F# code is very different. Every let construct introduces a new scope, and binds symbols to values in that scope. If execution escapes this introduced scope, the symbols are restored to their original meanings. This is clearly not identical to variable state mutation with assignment.

To clarify, let us desugar the F# code:

let a = 1 in  
  ((* a stands for 1 here *);
   (let a = (* a still stands for 1 here *) a + 1 in (* a stands for 2 here *));
   (* a stands for 1 here, again *))

Indeed the code

let a = 1 in  
  (printfn "%i" a;
   (let a = a + 1 in printfn "%i" a);
   printfn "%i" a)

prints out

 1 
 2
 1

Once symbols are bound to values, they cannot be assigned a new value. The only way to change the meaning of a bound symbol is to shadow it by introducing a new binding for this symbol (for example, with a let construct, as in let a = a + 1), but this shadowing will only have a localized effect: it will only affect the newly introduced scope. F# uses so-called 'lexical scoping', which simply means that one can identify the scope of a binding by simply looking at the code. Thus the scope of the let a = a + 1 binding in (let a = a + 1 in ..) is limited by the parentheses. With lexical scoping, there is no way for a piece of code to change the value of a bound symbol outside of itself, such as in the code that has called it.

Immutability is a great concept. Immutability allows programmers to pass values to functions without worrying that the function will change the value's state in unpredictable ways. Additionally, since value can't be mutated, programmers can process data shared across many threads without fear that the data will be mutated by another process; as a result, programmers can write multithreaded code without locks, and a whole class of errors related to race conditions and dead locking can be eliminated.

Functional programmers generally simulate state by passing extra parameters to functions; objects are "mutated" by creating an entirely new instance of an object with the desired changes and letting the garbage collector throw away the old instances if they are not needed. The resource overheads this style implies are dealt with by sharing structure. For example, changing the head of a singly-linked list of 1000 integers can be achieved by allocating a single new integer, reusing the tail of the original list (of length 999).

For the rare cases when mutation is really needed (for example, in number-crunching code which is a performance bottleneck), F# offers reference types and .NET mutable collections (such as arrays).

Recursion or Loops?[edit]

Imperative programming languages tend to iterate through collections with loops:

void ProcessItems(Item[] items)
{
    for(int i = 0; i < items.Length; i++)
    {
        Item myItem = items[i];
        proc(myItem); // process myItem
    }
}

This admits a direct translation to F# (type annotations for i and item are omitted because F# can infer them):

let processItems (items : Item []) = 
  for i in 0 .. items.Length do
    let item = items.[i] in
      proc item

However, the above code is clearly not written in a functional style. One problem with it is that it traverses an array of items. For many purposes including enumeration, functional programmers would use a different data structure, a singly linked list. Here is an example of iterating over this data structure with pattern matching:

let rec processItems = function
  | []       -> ()  // empty list: end recursion
  | head :: tail -> // split list in first item (head) and rest (tail)
      proc head;
      processItems tail // recursively enumerate list

It is important to note that because the recursive call to processItems appears as the last expression in the function, this is an example of so-called tail recursion. The F# compiler recognizes this pattern and compiles processItems to a loop. The processItems function therefore runs in constant space and does not cause stack overflows.

F# programmers rely on tail recursion to structure their programs whenever this technique contributes to code clarity.

A careful reader has noticed that in the above example proc function was coming from the environment. The code can be improved and made more general by parameterizing it by this function (making proc a parameter):

let rec processItems proc = function
  | []       -> ()
  | hd :: tl ->
      proc hd;
      processItems proc tl // recursively enumerate list

This processItems function is indeed so useful that it has made it into the standard library under the name of List.iter.

For the sake of completeness it must be mentioned that F# includes generic versions of List.iter called Seq.iter (other List.* functions usually have Seq.* counterparts as well) that works on lists, arrays, and all other collections. F# also includes a looping construct that works for all collections implementing the System.Collections.Generic.IEnumerable:

for item in collection do 
  process item

Function Composition Rather than Inheritance[edit]

Traditional OO uses implementation inheritance extensively; in other words, programmers create base classes with partial implementation, then build up object hierarchies from the base classes while overriding members as needed. This style has proven to be remarkably effective since the early 1990s, however this style is not contiguous with functional programming.

Functional programming aims to build simple, composable abstractions. Since traditional OO can only make an object's interface more complex, not simpler, inheritance is rarely used at all in F#. As a result, F# libraries tend to have fewer classes and very "flat" object hierarchies, as opposed to very deep and complex hierarchies found in equivalent Java or C# applications.

F# tends to rely more on object composition and delegation rather than inheritance to share snippets of implementation across modules.

Functions as First-Order Types[edit]

F# is a functional programming language, meaning that functions are first-order data types: they can be declared and used in exactly the same way that any other variable can be used.

In an imperative language like Visual Basic there is a fundamental difference between variables and functions.

Dim myVal as Integer
Dim myParam as Integer
myParam = 2
Public Function MyFunc(Dim param as Integer)
    MyFunc = (param * 2) + 7
End Function
myVal = MyFunc(myParam)

Notice the difference in syntax between defining and evaluating a function and defining and assigning a variable. In the preceding Visual Basic code we could perform a number of different actions with a variable we can:

  • create a token (the variable name) and associate it with a type
  • assign it a value
  • interrogate its value
  • pass it into a function or sub-routine (which is essentially a function that returns no value)
  • return it from a function

Functional programming makes no distinction between values and functions, so we can consider functions to be equal to all other data types. That means that we can:

  • create a token (the function variable name) and associate it with a type
  • assign it a value (the actual calculation)
  • interrogate its value (perform the calculation)
  • pass a function as a parameter of another function or sub-routine
  • return a function as the result of another function

Structure of F# Programs[edit]

A simple, non-trivial F# program has the following parts:

open System
 
(* This is a
multi-line comment *)
 
// This is a single-line comment
 
let rec fib = function
    | 0 -> 0
    | 1 -> 1
    | n -> fib (n - 1) + fib (n - 2)
 
let main() =
    Console.WriteLine("fib 5: {0}", (fib 5))
 
main()

Most F# code files begin with a number of open statements used to import namespaces, allowing programmers to reference classes in namespaces without having to write fully qualified type declarations. This keyword is functionally equivalent to the using directive in C# and Imports directive in VB.Net. For example, the Console class is found under the System namespace; without importing the namespace, a programmer would need to access the Console class through its fully qualified name, System.Console.

The body of the F# file usually contains functions to implement the business logic in an application.

Finally, many F# application exhibit this pattern:

let main() =
    (*
       ...
       This is the main loop.
       ...
    *)
 
main()
(*
   This is a top-level statement because it's
   not nested in any other functions. This calls 
   into the main method to run the main loop.
*)


In general, there is no explicit entry point in an F# application. Rather, when an F# application is compiled, the last file passed to the compiler is assumed to be the entry point, and the compiler executes all top-level statements in the file from top to bottom. While its possible to have any number of top-level statements in the entry point of a F# program, well-written F# only has a single top-level statement which calls the main loop of the application.

Previous: Getting Set Up Index Next: Values and Functions