Haskell/Libraries/Random

From Wikibooks, open books for an open world
Jump to navigation Jump to search

Random examples[edit | edit source]

Random numbers have many uses.

Example: Ten random integer numbers

import System.Random

main = do
   gen <- newStdGen
   let ns = randoms gen :: [Int]
   print $ take 10 ns

The IO action newStdGen creates a new StdGen pseudorandom number generator state. This StdGen can be passed to functions which need to generate pseudorandom numbers.

(There also exists a global random number generator which is initialized automatically in a system dependent fashion. This generator is maintained in the IO monad and can be accessed with getStdGen. This is perhaps a library wart, however, as all that you really ever need is newStdGen.)

Alternatively, one can using mkStdGen:

Example: Ten random floats using mkStdGen

import System.Random

randomList :: (Random a) => Int -> [a]
randomList seed = randoms (mkStdGen seed)

main :: IO ()
main = do print $ take 10 (randomList 42 :: [Float])

Running this script results in output like this:

[0.110407025,0.8453985,0.3077821,0.78138804,0.5242582,0.5196911,0.20084688,0.4794773,0.3240164,6.1566383e-2]

Example: Unsorting a list (imperfectly)

import Data.List ( sortBy )
import Data.Ord ( comparing )
import System.Random ( Random, RandomGen, randoms, newStdGen )

main :: IO ()
main =
 do gen <- newStdGen
    interact $ unlines . unsort gen . lines

unsort :: (RandomGen g) => g -> [x] -> [x]
unsort g es = map snd . sortBy (comparing fst) $ zip rs es
  where rs = randoms g :: [Integer]

There's more to random number generation than randoms. For example, you can use random (sans 's') to generate a single random number along with a new StdGen to be used for the next one. Also, randomR and randomRs take a parameter for specifying the range. See below for more ideas.

The Standard Random Number Generator[edit | edit source]

The Haskell standard random number functions and types are defined in the System.Random module. The definition of random is a bit tricky to follow because it uses classes to make itself more general.

From the standard:

 ---------------- The RandomGen class ------------------------
 
 class RandomGen g where
   genRange :: g -> (Int, Int)
   next     :: g -> (Int, g)
   split    :: g -> (g, g)
 
 ---------------- A standard instance of RandomGen -----------
 data StdGen = ... -- Abstract

This basically introduces StdGen, the standard random number generator "object". It's an instance of the RandomGen class which specifies the operations other pseudorandom number generator libraries need to implement to be used with the System.Random library.

Given r :: StdGen, you can say:

(x, r2) = next r

That gives you a random Int x and a new StdGen r2. The next function is defined in the RandomGen class, and you can apply it to something of type StdGen because StdGen is an instance of the RandomGen class, as below.

From the Standard:

instance RandomGen StdGen where ...
instance Read      StdGen where ...
instance Show      StdGen where ...

This also says that you can convert a StdGen to and from a string. (The dots are not Haskell syntax; they simply say that the Standard does not define an implementation of these instances.)

From the Standard:

mkStdGen :: Int -> StdGen

Put in a seed Int to the mkStdGen function, you'll get out a generator.

As a functional language, Haskell returns a new random number generator with the next. In languages with mutable variables, the random number generator routine has the hidden side effect of updating the state of the generator ready for the next call. Haskell won't do that. If you want to generate three random numbers in Haskell, you need to say something like:

let
   (x1, r2) = next r
   (x2, r3) = next r2
   (x3, r4) = next r3

The random values (x1, x2, x3) are themselves random integers. To get something in the range, say, (0,999) there ought to be a library routine built on this, and indeed there is:

From the Standard:

---------------- The Random class ---------------------------
class Random a where
   randomR :: RandomGen g => (a, a) -> g -> (a, g)
   random  :: RandomGen g => g -> (a, g)

   randomRs :: RandomGen g => (a, a) -> g -> [a]
   randoms  :: RandomGen g => g -> [a]

   randomRIO :: (a,a) -> IO a
   randomIO  :: IO a

Remember that StdGen is the only instance of type RandomGen (unless you roll your own random number generator). Therefore, you can substitute StdGen for 'g' in the types above and get this:

   randomR :: (a, a) -> StdGen -> (a, StdGen)
   random  :: StdGen -> (a, StdGen)

   randomRs :: (a, a) -> StdGen -> [a]
   randoms  :: StdGen -> [a]

But remember that this is all inside *another* class declaration "Random". This all says: any instance of Random can use these functions. The instances of Random in the Standard are:

instance Random Integer where ...
instance Random Float   where ...
instance Random Double  where ...
instance Random Bool    where ...
instance Random Char    where ...

So for any of these types you can get a random range. You can get a random integer with:

(x1, r2) = randomR (0,999) r

And you can get a random upper case character with

(c2, r3) = randomR ('A', 'Z') r2

You can even get a random bit with

(b3, r4) = randomR (False, True) r3

So far so good, but threading the random number state through your entire program like this is painful, error prone, and generally destroys the nice clean functional properties of your program.

One partial solution is the "split" function in the RandomGen class. It takes one generator and gives you two generators back. That lets you say something like this:

(r1, r2) = split r
x = foo r1

In this case, we are passing r1 down into function foo, which does something random with it and returns a result "x". We can then take "r2" as the random number generator for whatever comes next. Without "split" we would have to write

(x, r2) = foo r1

Yet this too is often clumsy. We can do it the quick and dirty way by putting the whole thing in the IO monad. Thus, we get a standard global random number generator just like any other language.

From the Standard:

---------------- The global random generator ---------------- 
newStdGen    :: IO StdGen 
setStdGen    :: StdGen -> IO ()
getStdGen    :: IO StdGen
getStdRandom :: (StdGen -> (a, StdGen)) -> IO a

We could write:

foo :: IO Int
foo = do
   r1 <- getStdGen
   let (x, r2) = randomR (0,999) r1
   setStdGen r2
   return x

That gets the global generator, uses it, and then updates it (otherwise every random number will be the same). But having to get and update the global generator every time you use it is a pain, so its more common to use getStdRandom. The argument to this is a function. Compare the type of that function to that of 'random' and 'randomR'. They both fit in rather well. To get a random integer in the IO monad you can say:

x <- getStdRandom $ randomR (1,999)

The 'randomR (1,999) has type StdGen -> (Int, StdGen), so it fits straight into the argument required by getStdRandom.

Using QuickCheck to Generate Random Data[edit | edit source]

Only being able to do random numbers via the IO monad is a bit of a pain. You find that some function deep inside your code needs a random number, and suddenly you have to rewrite half your program as IO actions instead of nice pure functions, or you need a StdGen parameter to tramp its way down there through all the higher level functions. We would prefer something a bit purer.

Recall from the State monad chapter, that patterns like:

let
   (x1, r2) = next r
   (x2, r3) = next r2
   (x3, r4) = next r3

Can be done with "do" notation:

   do -- Not real Haskell
      x1 <- random
      x2 <- random
      x3 <- random

Of course, you can do this in the IO monad, but it would be better if random numbers had their own little monad that specialized in random computations. And it just so happens that such a monad exists. It lives in the Test.QuickCheck module, and it's called Gen.

The reason that Gen lives in Test.QuickCheck is historical: that is where it was invented. The purpose of QuickCheck is to generate random unit tests to verify properties of your code. (Incidentally, QuickCheck works wonderfully, and most Haskell developers use it for testing). See the Introduction to QuickCheck on the HaskellWiki for more details. This tutorial will concentrate on using the Gen monad for generating random data.

To use QuickCheck modules, you will need to install the QuickCheck package. With that done, just put

import Test.QuickCheck

in your source file.

The Gen monad can be thought of as a monad of random computations. As well as generating random numbers, it provides a library of functions that build up complicated values out of simple ones.

So lets start with a routine to return three random integers between 0 and 999:

randomTriple :: Gen (Integer, Integer, Integer)
randomTriple = do
   x1 <- choose (0,999)
   x2 <- choose (0,999)
   x3 <- choose (0,999)
   return (x1, x2, x3)

choose is one of the functions from QuickCheck. Its the equivalent to randomR.

choose :: Random a => (a, a) -> Gen a

In other words, for any type "a" which is an instance of "Random" (see above), choose will map a range into a generator.

Once you have a Gen action you have to execute it.

The unGen action executes an action and returns the random result:

unGen :: Gen a -> StdGen -> Int -> a

The three arguments are:

  1. The generator action.
  2. A random number generator.
  3. The "size" of the result. This isn't used in the example above, but if you were generating a data structure with a variable number of elements (like a list) then this parameter lets you pass some notion of the expected size into the generator. We'll see an example later.

So for example, this generates three arbitrary numbers:

let
   triple = unGen randomTriple (mkStdGen 1) 1

But the numbers will always be the same because the same seed value is used! If you want different numbers then you have to use a different StdGen argument.

A common pattern in most programming languages involves a random number generator choosing between two courses of action:

-- Not Haskell code
r := random (0,1)
if r == 1 then foo else bar

QuickCheck provides a more declarative way of doing the same thing. If foo and bar are both generators returning the same type then we can say:

oneof [foo, bar]

This has an equal chance of returning either foo or "bar If you wanted different odds, then you could say something like:

frequency [ (30, foo), (70, bar) ]

oneof takes a simple list of Gen actions and selects one of them at random. frequency does something similar, but the probability of each item is given by the associated weighting.

oneof :: [Gen a] -> Gen a
frequency :: [(Int, Gen a)] -> Gen a