Haskell/FFI

From Wikibooks, open books for an open world
Jump to navigation Jump to search

Using Haskell is fine, but in the real world there are a large number of useful libraries in other languages, especially C. To use these libraries, and let C code use Haskell functions, there is the Foreign Function Interface (FFI).

Calling C from Haskell[edit | edit source]

Marshalling (Type Conversion)[edit | edit source]

When using C functions, it is necessary to convert Haskell types to the appropriate C types. These are available in the Foreign.C.Types module; some examples are given in the following table.

Haskell Foreign.C.Types C
Double CDouble double
Char CUChar unsigned char
Int CLong long int

The operation of converting Haskell types into C types is called marshalling (and the opposite, predictably, unmarshalling). For basic types this is quite straightforward: for floating-point one uses realToFrac (either way, as e.g. both Double and CDouble are instances of classes Real and Fractional), for integers fromIntegral, and so on.

Calling a pure C function[edit | edit source]

A pure function implemented in C does not present significant trouble in Haskell. The sin function of the C standard library is a fine example:

{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign
import Foreign.C.Types

foreign import ccall unsafe "math.h sin"
     c_sin :: CDouble -> CDouble

First, we specify a GHC extension for the FFI in the first line. We then import the Foreign and Foreign.C.Types modules, the latter of which contains information about CDouble, the representation of double-precision floating-point numbers in C.

We then specify that we are importing a foreign function, with a call to C. A "safety level" has to be specified with the keyword safe (the default) or unsafe. In general, unsafe is more efficient, and safe is required only for C code that could call back a Haskell function. Since that is a very particular case, it is actually quite safe to use the unsafe keyword in most cases. Finally, we need to specify header and function name, separated by a space.

The Haskell function name is then given, in our case we use a standard c_sin, but it could have been anything. Note that the function signature must be correct—GHC will not check the C header to confirm that the function actually takes a CDouble and returns another, and writing a wrong one could have unpredictable results.

It is then possible to generate a wrapper around the function using CDouble so that it looks exactly like any Haskell function.

haskellSin :: Double -> Double
haskellSin = realToFrac . c_sin . realToFrac

Importing C's sin is simple because it is a pure function that takes a plain double as input and returns another as output: things will complicate with impure functions and pointers, which are ubiquitous in more complicated C libraries.

Impure C Functions[edit | edit source]

A classic impure C function is rand, for the generation of pseudo-random numbers. Suppose you do not want to use Haskell's System.Random.randomIO, for example because you want to replicate exactly the series of pseudo-random numbers output by some C routine. Then, you could import it just like sin before:

{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign
import Foreign.C.Types

foreign import ccall unsafe "stdlib.h rand"
     c_rand :: CUInt -- Oops!

If you try this naïve implementation in GHCI, you will notice that c_rand is returning always the same value:

> c_rand
1714636915
> c_rand
1714636915

indeed, we have told GHC that it is a pure function, and GHC sees no point in calculating twice the result of a pure function. Note that GHC did not give any error or warning message.

In order to make GHC understand this is no pure function, we have to use the IO monad:

{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign
import Foreign.C.Types

foreign import ccall unsafe "stdlib.h rand"
     c_rand :: IO CUInt

foreign import ccall "stdlib.h srand"
     c_srand :: CUInt -> IO ()

Here, we also imported the srand function, to be able to seed the C pseudo-random generator.

> c_rand
1957747793
> c_rand
424238335
> c_srand 0
> c_rand
1804289383
> c_srand 0
> c_rand
1804289383

Working with C Pointers[edit | edit source]

The most useful C functions are often those that do complicated calculations with several parameters, and with increasing complexity the need of returning control codes arises. This means that a typical paradigm of C libraries is to give pointers of allocated memory as "targets" in which the results may be written, while the function itself returns an integer value (typically, if 0, computation was successful, otherwise there was a problem specified by the number). Another possibility is that the function will return a pointer to a structure (possibly defined in the implementation, and therefore unavailable to us).

As a pedagogical example, we consider the gsl_frexp function of the GNU Scientific Library, a freely available library for scientific computation. It is a simple C function with prototype:

double gsl_frexp (double x, int * e)

The function takes a double x, and it returns its normalised fraction f and integer exponent e so that:

We interface this C function into Haskell with the following code:

{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign
import Foreign.Ptr
import Foreign.C.Types
import System.IO.Unsafe         -- for unsafePerformIO

foreign import ccall unsafe "gsl/gsl_math.h gsl_frexp"
     gsl_frexp :: CDouble -> Ptr CInt -> IO CDouble

The new part is Ptr, which can be used with any instance of the Storable class, among which all C types, but also several Haskell types.

Notice how the result of the gsl_frexp function is in the IO monad. This is typical when working with pointers, be they used for input or output (as in this case); we will see shortly what would happen had we used a simple CDouble for the function.

The frexp function is implemented in pure Haskell code as follows:

frexp :: Double -> (Double, Int)
frexp x = unsafePerformIO $
    alloca $ \expptr -> do
        f <- gsl_frexp (realToFrac x) expptr
        e <- peek expptr
        return (realToFrac f, fromIntegral e)

We know that, memory management details aside, the function is pure: that's why the signature returns a tuple with f and e outside of the IO monad. Yet, f is provided inside of it: to extract it, we use the function unsafePerformIO, which extracts values from the IO monad: obviously, it is legitimate to use it only when we know the function is pure, and we can allow GHC to optimise accordingly.

To allocate pointers, we use the alloca function, which also takes responsibility for freeing memory. As an argument, alloca takes a function of type Ptr a -> IO b, and returns the IO b. In practice, this translates to the following usage pattern with λ functions:

... alloca $ \pointer -> do
        c_function argument pointer
        result <- peek pointer
        return result

The pattern can easily be nested if several pointers are required:

... alloca $ \firstPointer ->
        alloca $ \secondPointer -> do
            c_function argument firstPointer secondPointer
            first  <- peek firstPointer
            second <- peek secondPointer
            return (first, second)

Back to our frexp function: in the λ function that is the argument to alloca, the function is evaluated and the pointer is read immediately afterwards with peek. Here we can understand why we wanted the imported C function gsl_frexp to return a value in the IO monad: if GHC could decide when to calculate the quantity f, it would likely decide not to do it until it is necessary: that is at the last line when return uses it, and after e has been read from an allocated, but yet uninitialised memory address, which will contain random data. In short, we want gsl_frexp to return a monadic value because we want to determine the sequence of computations ourselves.

If some other function had required a pointer to provide input instead of storing output, one would have used the similar poke function to set the pointed value, obviously before evaluating the function:

... alloca $ \inputPointer ->
        alloca $ \outputPointer -> do
            poke inputPointer value
            c_function argument inputPointer outputPointer
            result <- peek outputPointer
            return result

In the final line, the results are arranged in a tuple and returned, after having been converted from C types.

To test the function, remember to link GHC to the GSL; in GHCI, do:

$ ghci frexp.hs -lgsl

(Note that most systems do not come with the GSL preinstalled, and you may have to download and install its development packages.)

Working with C Structures[edit | edit source]

Very often data are returned by C functions in form of structs or pointers to these. In some rare cases, these structures are returned directly, but more often they are returned as pointers; the return value is most often an int that indicates the correctness of execution.

We will consider another GSL function, gsl_sf_bessel_Jn_e. This function provides the regular cylindrical Bessel function for a given order n, and returns the result as a gsl_sf_result structure pointer. The structure contains two doubles, one for the result and one for the error. The integer error code returned by the function can be transformed in a C string by the function gsl_strerror. The signature of the Haskell function we are looking for is therefore:

BesselJn :: Int -> Double -> Either String (Double, Double)

where the first argument is the order of the cylindrical Bessel function, the second is the function's argument, and the returned value is either an error message or a tuple with result and margin of error.

Making a New Instance of the Storable class[edit | edit source]

In order to allocate and read a pointer to a gsl_sf_result structure, it is necessary to make it an instance of the Storable class.

In order to do that, it is useful to use the hsc2hs program: we create first a Bessel.hsc file, with a mixed syntax of Haskell and C macros, which is later expanded into Haskell by the command:

$ hsc2hs Bessel.hsc

After that, we simply load the Bessel.hs file in GHC.

This is the first part of file Bessel.hsc:

{-# LANGUAGE ForeignFunctionInterface #-}

module Bessel (besselJn) where

import Foreign
import Foreign.Ptr
import Foreign.C.String
import Foreign.C.Types

#include <gsl/gsl_sf_result.h>

data GslSfResult = GslSfResult { gsl_value :: CDouble, gsl_error :: CDouble }

instance Storable GslSfResult where
    sizeOf    _ = (#size gsl_sf_result)
    alignment _ = alignment (undefined :: CDouble)
    peek ptr = do
        value <- (#peek gsl_sf_result, val) ptr
        error <- (#peek gsl_sf_result, err) ptr
        return  GslSfResult { gsl_value = value, gsl_error = error }
    poke ptr (GslSfResult value error) = do
        (#poke gsl_sf_result, val) ptr value
        (#poke gsl_sf_result, err) ptr error

We use the #include directive to make sure hsc2hs knows where to find information about gsl_sf_result. We then define a Haskell data structure mirroring the GSL's, with two CDoubles: this is the class we make an instance of Storable. Strictly, we need only sizeOf, alignment and peek for this example; poke is added for completeness.

  • sizeOf is obviously fundamental to the allocation process, and is calculated by hsc2hs with the #size macro.
  • alignment is the size in bytes of the data structure alignment. In general, it should be the largest alignment of the elements of the structure; in our case, since the two elements are the same, we simply use CDouble's. The value of the argument to alignment is inconsequential, what is important is the type of the argument.
  • peek is implemented using a do-block and the #peek macros, as shown. val and err are the names used for the structure fields in the GSL source code.
  • Similarly, poke is implemented with the #poke macro.

Importing the C Functions[edit | edit source]

foreign import ccall unsafe "gsl/gsl_bessel.h gsl_sf_bessel_Jn_e"
     c_besselJn :: CInt -> CDouble -> Ptr GslSfResult -> IO CInt

foreign import ccall unsafe "gsl/gsl_errno.h gsl_set_error_handler_off"
     c_deactivate_gsl_error_handler :: IO ()

foreign import ccall unsafe "gsl/gsl_errno.h gsl_strerror"
     c_error_string :: CInt -> IO CString

We import several functions from the GSL libraries: first, the Bessel function itself, which will do the actual work. Then, we need a particular function, gsl_set_error_handler_off, because the default GSL error handler will simply crash the program, even if called by Haskell: we, instead, plan to deal with errors ourselves. The last function is the GSL-wide interpreter that translates error codes in human-readable C strings.

Implementing the Bessel Function[edit | edit source]

Finally, we can implement the Haskell version of the GSL cylindrical Bessel function of order n.

besselJn :: Int -> Double -> Either String (Double, Double)
besselJn n x = unsafePerformIO $
    alloca $ \gslSfPtr -> do
        c_deactivate_gsl_error_handler
        status <- c_besselJn (fromIntegral n) (realToFrac x) gslSfPtr
        if status == 0
            then do
                GslSfResult val err <- peek gslSfPtr
                return $ Right (realToFrac val, realToFrac err)
            else do
                error <- c_error_string status
                error_message <- peekCString error
                return $ Left ("GSL error: "++error_message)

Again, we use unsafePerformIO because the function is pure, even though its nuts-and-bolts implementation is not. After allocating a pointer to a GSL result structure, we deactivate the GSL error handler to avoid crashes in case something goes wrong, and finally we can call the GSL function. At this point, if the status returned by the function is 0, we unmarshal the result and return it as a tuple. Otherwise, we call the GSL error-string function, and pass the error as a Left result instead.

Examples[edit | edit source]

Once we are finished writing the Bessel.hsc function, we have to convert it to proper Haskell and load the produced file:

$ hsc2hs Bessel.hsc
$ ghci Bessel.hs -lgsl

We can then call the Bessel function with several values:

> besselJn 0 10
Right (-0.2459357644513483,1.8116861737200453e-16)
> besselJn 1 0
Right (0.0,0.0)
> besselJn 1000 2
Left "GSL error: underflow"

Advanced Topics[edit | edit source]

This section contains an advanced example with some more complex features of the FFI. We will import into Haskell one of the more complicated functions of the GSL, the one used to calculate the integral of a function between two given points with an adaptive Gauss-Kronrod algorithm. The GSL function is gsl_integration_qag.

This example will illustrate function pointers, export of Haskell functions to C routines, enumerations, and handling pointers of unknown structures.

Available C Functions and Structures[edit | edit source]

The GSL has three functions which are necessary to integrate a given function with the considered method:

gsl_integration_workspace * gsl_integration_workspace_alloc (size_t n);
void gsl_integration_workspace_free (gsl_integration_workspace * w);
int gsl_integration_qag (const gsl_function * f, double a, double b, 
                         double epsabs, double epsrel, size_t limit, 
                         int key, gsl_integration_workspace * workspace, 
                         double * result, double * abserr);

The first two deal with allocation and deallocation of a "workspace" structure of which we know nothing (we just pass a pointer around). The actual work is done by the last function, which requires a pointer to a workspace.

To provide functions, the GSL specifies an appropriate structure for C:

struct gsl_function
{
  double (* function) (double x, void * params);
  void * params;
};

The reason for the void pointer is that it is not possible to define λ functions in C, so the function cannot be partially applied with some general parameters independent of x and these parameters are therefore passed in a pointer of an unknown type. In Haskell, we do not need the params element, and will consistently ignore it.

Imports and Inclusions[edit | edit source]

We start our qag.hsc file with the following:

{-# LANGUAGE ForeignFunctionInterface, EmptyDataDecls #-}

module Qag ( qag,
             gauss15,
             gauss21,
             gauss31,
             gauss41,
             gauss51,
             gauss61 ) where

import Foreign
import Foreign.Ptr
import Foreign.C.Types
import Foreign.C.String

#include <gsl/gsl_math.h>
#include <gsl/gsl_integration.h>

foreign import ccall unsafe "gsl/gsl_errno.h gsl_strerror"
     c_error_string :: CInt -> IO CString

foreign import ccall unsafe "gsl/gsl_errno.h gsl_set_error_handler_off"
    c_deactivate_gsl_error_handler :: IO ()

We declare the EmptyDataDecls pragma, which we will use later for the Workspace data type. Since this file will have a good number of functions that should not be available to the outside world, we also declare it a module and export only the final function qag and the gauss- flags. We also include the relevant C headers of the GSL. The import of C functions for error messages and deactivation of the error handler was described before.

Enumerations[edit | edit source]

One of the arguments of gsl_integration_qag is key, an integer value that can have values from 1 to 6 and indicates the integration rule. GSL defines a macro for each value, but in Haskell it is more appropriate to define a type, which we call IntegrationRule. Also, to have its values automatically defined by hsc2hs, we can use the enum macro:

newtype IntegrationRule = IntegrationRule { rule :: CInt }
#{enum IntegrationRule, IntegrationRule,
    gauss15 = GSL_INTEG_GAUSS15,
    gauss21 = GSL_INTEG_GAUSS21,
    gauss31 = GSL_INTEG_GAUSS31,
    gauss41 = GSL_INTEG_GAUSS41,
    gauss51 = GSL_INTEG_GAUSS51,
    gauss61 = GSL_INTEG_GAUSS61
  }

hsc2hs will search the headers for the macros and give our variables the correct values. The enum directive will define a function with an appropriate type signature for each of the enum values. The above example will get translated to something like this (with the C macros appropriately replaced by their values):

newtype IntegrationRule = IntegrationRule { rule :: CInt }

gauss15  :: IntegrationRule
gauss15  = IntegrationRule GSL_INTEG_GAUSS15
gauss21  :: IntegrationRule
gauss21  = IntegrationRule GSL_INTEG_GAUSS21
.
.
.

The variables cannot be modified and are essentially constant flags. Since we did not export the IntegrationRule constructor in the module declaration, but only the gauss flags, it is impossible for a user to even construct an invalid value. One thing less to worry about!

Haskell Function Target[edit | edit source]

We can now write down the signature of the function we desire:

qag :: IntegrationRule                 -- Algorithm type
    -> Int                             -- Step limit
    -> Double                          -- Absolute tolerance
    -> Double                          -- Relative tolerance
    -> (Double -> Double)              -- Function to integrate
    -> Double                          -- Integration interval start
    -> Double                          -- Integration interval end
    -> Either String (Double, Double)  -- Result and (absolute) error estimate

Note how the order of arguments is different from the C version: indeed, since C does not have the possibility of partial application, the ordering criteria are different than in Haskell.

As in the previous example, we indicate errors with a Either String (Double, Double) result.

Passing Haskell Functions to the C Algorithm[edit | edit source]

type CFunction = CDouble -> Ptr () -> CDouble

data GslFunction = GslFunction (FunPtr CFunction) (Ptr ())
instance Storable GslFunction where
    sizeOf    _ = (#size gsl_function)
    alignment _ = alignment (undefined :: Ptr ())
    peek ptr = do
        function <- (#peek gsl_function, function) ptr
        return $ GslFunction function nullPtr
    poke ptr (GslFunction fun nullPtr) = do
        (#poke gsl_function, function) ptr fun

makeCfunction :: (Double -> Double) -> (CDouble -> Ptr () -> CDouble)
makeCfunction f = \x voidpointer -> realToFrac $ f (realToFrac x)

foreign import ccall "wrapper"
    makeFunPtr :: CFunction -> IO (FunPtr CFunction)

We define a shorthand type, CFunction, for readability. Note that the void pointer has been translated to a Ptr (), since we have no intention of using it. Then it is the turn of the gsl_function structure: no surprises here. Note that the void pointer is always assumed to be null, both in peek and in poke, and is never really read nor written.

To make a Haskell Double -> Double function available to the C algorithm, we make two steps: first, we re-organise the arguments using a λ function in makeCfunction; then, in makeFunPtr, we take the function with reordered arguments and produce a function pointer that we can pass on to poke, so we can construct the GslFunction data structure.

Handling Unknown Structures[edit | edit source]

data Workspace
foreign import ccall unsafe "gsl/gsl_integration.h gsl_integration_workspace_alloc"
    c_qag_alloc :: CSize -> IO (Ptr Workspace)
foreign import ccall unsafe "gsl/gsl_integration.h gsl_integration_workspace_free"
    c_qag_free  :: Ptr Workspace -> IO ()

foreign import ccall safe "gsl/gsl_integration.h gsl_integration_qag"
    c_qag :: Ptr GslFunction -- Allocated GSL function structure
          -> CDouble -- Start interval
          -> CDouble -- End interval
          -> CDouble -- Absolute tolerance
          -> CDouble -- Relative tolerance
          -> CSize   -- Maximum number of subintervals
          -> CInt    -- Type of Gauss-Kronrod rule
          -> Ptr Workspace -- GSL integration workspace
          -> Ptr CDouble -- Result
          -> Ptr CDouble -- Computation error
          -> IO CInt -- Exit code

The reason we imported the EmptyDataDecls pragma is this: we are declaring the data structure Workspace without providing any constructor. This is a way to make sure it will always be handled as a pointer, and never actually instantiated.

Otherwise, we normally import the allocating and deallocating routines. We can now import the integration function, since we have all the required pieces (GslFunction and Workspace).

The Complete Function[edit | edit source]

It is now possible to implement a function with the same functionality as the GSL's QAG algorithm.

qag gauss steps abstol reltol f a b = unsafePerformIO $ do
    c_deactivate_gsl_error_handler
    workspacePtr <- c_qag_alloc (fromIntegral steps)
    if workspacePtr == nullPtr
        then
            return $ Left "GSL could not allocate workspace"
        else do
            fPtr <- makeFunPtr $ makeCfunction f
            alloca $ \gsl_f -> do
                poke gsl_f (GslFunction fPtr nullPtr)
                alloca $ \resultPtr -> do
                    alloca $ \errorPtr -> do
                        status <- c_qag gsl_f
                                        (realToFrac a)
                                        (realToFrac b)
                                        (realToFrac abstol)
                                        (realToFrac reltol)
                                        (fromIntegral steps)
                                        (rule gauss)
                                        workspacePtr
                                        resultPtr
                                        errorPtr
                        c_qag_free workspacePtr
                        freeHaskellFunPtr fPtr
                        if status /= 0
                            then do
                                c_errormsg <- c_error_string status
                                errormsg   <- peekCString c_errormsg
                                return $ Left errormsg
                            else do
                                c_result <- peek resultPtr
                                c_error  <- peek  errorPtr
                                let result = realToFrac c_result
                                let error  = realToFrac c_error
                                return $ Right (result, error)

First and foremost, we deactivate the GSL error handler, that would crash the program instead of letting us report the error.

We then proceed to allocate the workspace; notice that, if the returned pointer is null, there was an error (typically, too large size) that has to be reported.

If the workspace was allocated correctly, we convert the given function to a function pointer and allocate the GslFunction struct, in which we place the function pointer. Allocating memory for the result and its error margin is the last thing before calling the main routine.

After calling, we have to do some housekeeping and free the memory allocated by the workspace and the function pointer. Note that it would be possible to skip the bookkeeping using ForeignPtr, but the work required to get it to work is more than the effort to remember one line of cleanup.

We then proceed to check the return value and return the result, as was done for the Bessel function.

Self-Deallocating Pointers[edit | edit source]

In the previous example, we manually handled the deallocation of the GSL integration workspace, a data structure we know nothing about, by calling its C deallocation function. It happens that the same workspace is used in several integration routines, which we may want to import in Haskell.

Instead of replicating the same allocation/deallocation code each time, which could lead to memory leaks when someone forgets the deallocation part, we can provide a sort of "smart pointer", which will deallocate the memory when it is not needed any more. This is called ForeignPtr (do not confuse with Foreign.Ptr: this one's qualified name is actually Foreign.ForeignPtr!). The function handling the deallocation is called the finalizer.

In this section we will write a simple module to allocate GSL workspaces and provide them as appropriately configured ForeignPtrs, so that users do not have to worry about deallocation.

The module, written in file GSLWorkspace.hs, is as follows:

{-# LANGUAGE ForeignFunctionInterface, EmptyDataDecls #-}

module GSLWorkSpace (Workspace, createWorkspace) where

import Foreign.C.Types
import Foreign.Ptr
import Foreign.ForeignPtr

data Workspace
foreign import ccall unsafe "gsl/gsl_integration.h gsl_integration_workspace_alloc"
    c_ws_alloc :: CSize -> IO (Ptr Workspace)
foreign import ccall unsafe "gsl/gsl_integration.h &gsl_integration_workspace_free"
    c_ws_free  :: FunPtr( Ptr Workspace -> IO () )

createWorkspace :: CSize -> IO (Maybe (ForeignPtr Workspace) )
createWorkspace size = do
    ptr <- c_ws_alloc size
    if ptr /= nullPtr
        then do
           foreignPtr <- newForeignPtr c_ws_free ptr
           return $ Just foreignPtr
        else
           return Nothing

We first declare our empty data structure Workspace, just like we did in the previous section.

The gsl_integration_workspace_alloc and gsl_integration_workspace_free functions will no longer be needed in any other file: here, note that the deallocation function is called with an ampersand ("&"), because we do not actually want the function, but rather a pointer to it to set as a finalizer.

The workspace creation function returns a IO (Maybe) value, because there is still the possibility that allocation is unsuccessful and the null pointer is returned. The GSL does not specify what happens if the deallocation function is called on the null pointer, so for safety we do not set a finalizer in that case and return IO Nothing; the user code will then have to check for "Just-ness" of the returned value.

If the pointer produced by the allocation function is non-null, we build a foreign pointer with the deallocation function, inject into the Maybe and then the IO monad. That's it, the foreign pointer is ready for use!

The qag.hsc file must now be modified to use the new module; the parts that change are:

{-# LANGUAGE ForeignFunctionInterface #-}

-- [...]

import GSLWorkSpace

import Data.Maybe(isNothing, fromJust)

-- [...]

qag gauss steps abstol reltol f a b = unsafePerformIO $ do
    c_deactivate_gsl_error_handler
    ws <- createWorkspace (fromIntegral steps)
    if isNothing ws
        then
            return $ Left "GSL could not allocate workspace"
        else do
            withForeignPtr (fromJust ws) $ \workspacePtr -> do

-- [...]

Obviously, we do not need the EmptyDataDecls extension here any more; instead we import the GSLWorkSpace module, and also a couple of nice-to-have functions from Data.Maybe. We also remove the foreign declarations of the workspace allocation and deallocation functions.

The most important difference is in the main function, where we (try to) allocate a workspace ws, test for its Justness, and if everything is fine we use the withForeignPtr function to extract the workspace pointer. Everything else is the same.

Calling Haskell from C[edit | edit source]

Sometimes it is also convenient to call Haskell from C, in order to take advantage of some of Haskell's features which are tedious to implement in C, such as lazy evaluation.

We will consider a typical Haskell example, Fibonacci numbers. These are produced in an elegant, haskellian one-liner as:

fibonacci = 0 : 1 : zipWith (+) fibonacci (tail fibonacci)

Our task is to export the ability to calculate Fibonacci numbers from Haskell to C. However, in Haskell, we typically use the Integer type, which is unbounded: this cannot be exported to C, since there is no such corresponding type. To provide a larger range of outputs, we specify that the C function shall output, whenever the result is beyond the bounds of its integer type, an approximation in floating-point. If the result is also beyond the range of floating-point, the computation will fail. The status of the result (whether it can be represented as a C integer, a floating-point type or not at all) is signalled by the status integer returned by the function. Its desired signature is therefore:

int fib( int index, unsigned long long* result, double* approx )

Haskell Source[edit | edit source]

The Haskell source code for file fibonacci.hs is:

{-# LANGUAGE ForeignFunctionInterface #-}

module Fibonacci where

import Foreign
import Foreign.C.Types

fibonacci :: (Integral a) => [a]
fibonacci = 0 : 1 : zipWith (+) fibonacci (tail fibonacci)

foreign export ccall fibonacci_c :: CInt -> Ptr CULLong -> Ptr CDouble -> IO CInt
fibonacci_c :: CInt -> Ptr CULLong -> Ptr CDouble -> IO CInt
fibonacci_c n intPtr dblPtr
    | badInt && badDouble = return 2
    | badInt              = do
        poke dblPtr dbl_result
        return 1
    | otherwise           = do
        poke intPtr (fromIntegral result)
        poke dblPtr dbl_result
        return 0
    where
    result     = fibonacci !! (fromIntegral n)
    dbl_result = realToFrac result
    badInt     = result > toInteger (maxBound :: CULLong)
    badDouble  = isInfinite dbl_result

When exporting, we need to wrap our functions in a module (it is a good habit anyway). We have already seen the Fibonacci infinite list, so let's focus on the exported function: it takes an argument, two pointers to the target unsigned long long and double, and returns the status in the IO monad (since writing on pointers is a side effect).

The function is implemented with input guards, defined in the where clause at the bottom. A successful computation will return 0, a partially successful 1 (in which we still can use the floating-point value as an approximation), and a completely unsuccessful one will return 2.

Note that the function does not call alloca, since the pointers are assumed to have been already allocated by the calling C function.

The Haskell code can then be compiled with GHC:

ghc -c fibonacci.hs

C Source[edit | edit source]

The compilation of fibonacci.hs has spawned several files, among which fibonacci_stub.h, which we include in our C code in file fib.c:

#include <stdio.h>
#include <stdlib.h>
#include "fibonacci_stub.h"

int main(int argc, char *argv[]) {
    if (argc < 2) {
        printf("Usage: %s <number>\n", argv[0]);
        return 2;
    }

    hs_init(&argc, &argv);

    const int arg = atoi(argv[1]);
    unsigned long long res;
    double approx;
    const int status = fibonacci_c(arg, &res, &approx);

    hs_exit();
    switch (status) {
    case 0:
        printf("F_%d: %llu\n", arg, res);
        break;
    case 1:
        printf("Error: result is out of bounds\n");
        printf("Floating-point approximation: %e\n", approx);
        break;
    case 2:
        printf("Error: result is out of bounds\n");
        printf("Floating-point approximation is infinite\n");
        break;
    default:
        printf("Unknown error: %d\n", status);
    }

    return status;
}

The notable thing is that we need to initialise the Haskell environment with hs_init, which we call passing it the command-line arguments of main; we also have to shut Haskell down with hs_exit() when we are done. The rest is fairly standard C code for allocation and error handling.

Note that you have to compile the C code with GHC, not your C compiler!

ghc -no-hs-main fib.c fibonacci.o -o fib

You can then proceed to test the algorithm:

./fib 42
F_42: 267914296
$ ./fib 666
Error: result is out of bounds
Floating-point approximation: 6.859357e+138
$ ./fib 1492
Error: result is out of bounds
Floating-point approximation is infinite
./fib -1
fib: Prelude.(!!): negative index